Dietary intake, Nutritional status, and Health outcomes among Vegan, Vegetarian and Omnivore families: results from the observational study

Statistical report - mixed-effects models in all children


Authors and affiliations

Marina Heniková1,2, Anna Ouřadová1, Eliška Selinger1,3, Filip Tichanek4, Petra Polakovičová4, Dana Hrnčířová2, Pavel Dlouhý2, Martin Světnička5, Eva El-Lababidi5, Jana Potočková1, Tilman Kühn6, Monika Cahová4, Jan Gojda1


1 Department of Internal Medicine, Kralovske Vinohrady University Hospital and Third Faculty of Medicine, Charles University, Prague, Czech Republic.
2 Department of Hygiene, Third Faculty of Medicine, Charles University, Prague, Czech Republic.
3 National Health Institute, Prague, Czech Republic.
4 Institute for Clinical and Experimental Medicine, Prague, Czech Republic.
5 Department of Pediatrics, Kralovske Vinohrady University Hospital and Third Faculty of Medicine, Charles University, Prague, Czech Republic.
6 Department of Epidemiology, MedUni, Vienna, Austria.


This is the statistical report for a study currently under review at the journal Communications Medicine.

When using this code or data, cite the original publication:

TO BE ADDED

BibTex citation for the original publication:

TO BE ADDED


Original GitHub repository: https://github.com/filip-tichanek/kompas_clinical

Statistical reports can be found on the reports hub.

Data analysis is described in detail in the statistical methods report.


1 Introduction

This project is designed to evaluate and compare clinical outcomes across three distinct dietary strategy groups:

  • Vegans
  • Vegetarians
  • Omnivores

The dataset includes both adults and children, with data clustered within families.

1.1 Main Questions

The study addresses the following key questions:

Q1. Do clinical outcomes vary significantly across different diet strategies?

Q2. Beyond diet group, which factors (e.g., sex, age, breastfeeding status for children, or supplementation when applicable) most strongly influence clinical outcomes? How correlated (“clustered”) are these characteristics within the same family?

Q3. Could the clinical characteristics effectively discriminate between different diet groups?

1.2 Statistical Methods

For full methodological details, see this report. In brief:

  • Robust linear mixed-effects models (rLME) were used to estimate adjusted differences between diet groups (Q1) and assess the importance of other variables (Q2), including how much clinical characteristics tend to cluster within families. Covariates included age, sex, breastfeeding status for children, and relevant supplementation factors where applicable.

  • Elastic net logistic regression was employed to answer Q3, evaluating whether clinical characteristics provide a strong overall signal distinguishing between diet groups, incorporating a predictive perspective.

All analyses were conducted separately for adults and children.

2 Analysis

2.1 Import initiation file

getwd()
## [1] "/home/ticf/GitRepo/ticf/368_MOCA_kompas_clinical"
setwd('/home/ticf/GitRepo/ticf/368_MOCA_kompas_clinical/')
source('r/368_initiation.R')

2.2 Mixed models

All models for children are adjusted for log2_age (where applicable, i.e. not for the age-adjusted anthropometric outcomes), SEX, aBreastFeed_full_stopped, aBreastFeed_total_stopped, and aBreastFeed_full_duration (the interaction of the latter with the duration of dependence on breastfeeding). Further covariates are added where relevant: supplementation for vitamin and biogenic-element levels, and birth weight (aBiW) for anthropometric outcomes.
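
The adjustment scheme above can be illustrated with a small sketch. The calls later in this report pass `exclude`, `include`, and `remove_random` arguments to `rlme()`, a helper defined in the initiation script and not shown here. The function `build_formula()` below is a hypothetical stand-in, not the authors' implementation; the default covariate set and the lme4-style random-intercept term `(1 | FAM)` are assumptions inferred from the model output.

```r
# Hypothetical helper: NOT the authors' rlme(). It only illustrates how
# include/exclude arguments could translate into a model formula.
build_formula <- function(exclude = character(), include = character(),
                          remove_random = FALSE) {
  # Assumed default covariate set for the children models
  base_terms <- c("GRP", "SEX",
                  "aBreastFeed_full_stopped",
                  "aBreastFeed_full_duration",
                  "aBreastFeed_total_stopped",
                  "log2_age")
  terms_used <- c(setdiff(base_terms, exclude), include)
  # Random intercept per family (lme4-style syntax, an assumption)
  if (!remove_random) terms_used <- c(terms_used, "(1 | FAM)")
  reformulate(terms_used, response = "outcome")
}

# Anthropometric outcomes: drop log2_age, add birth weight
f_anthro <- build_formula(exclude = "log2_age", include = "aBiW")
print(f_anthro)
```

Keeping the covariate logic in one place means every leave-one-factor model differs from the main model only in its `exclude` argument.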

AIC_child_all <- data.frame(outcome = NA,
                            estimand = 'mean',
                            log2_age = NA,
                            aSEX = NA,
                            Breast_Feed = NA,
                            other_cov = NA,
                            diet = NA,
                            family = NA)

diet_child_all <- data.frame(outcome = NA,
                         estimand = NA,
                         VN_OM_diff = NA, 
                         VN_OM_P = NA,
                         VG_OM_diff = NA, 
                         VG_OM_P = NA,
                         VN_VG_diff = NA, 
                         VN_VG_P = NA)

diet_child_all_non_robust <- data.frame(outcome = NA,
                         estimand = NA,
                         VN_OM_diff = NA, 
                         VN_OM_P = NA,
                         VG_OM_diff = NA, 
                         VG_OM_P = NA,
                         VN_VG_diff = NA, 
                         VN_VG_P = NA)

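i <- 1     # index of the current outcome; incremented after each subsection
j <- 1
ni <- 10   # offset: the outcome for index i is column i + ni of dat_child_all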
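
The counters initialised above drive outcome selection in every subsection that follows: the outcome name is read by position, copied into a generic `outcome` column, and rows with a missing outcome are dropped. The report does this with dplyr (`mutate(outcome = !!sym(column_name))`); the sketch below shows the same idea in base R on a made-up data frame. The toy column layout and values are assumptions for illustration only.

```r
# Toy data: two identifier columns followed by two outcome columns (made-up values)
toy <- data.frame(
  ID           = 1:4,
  FAM          = c("a", "a", "b", "b"),
  aMASS_Perc   = c(40, NA, 55, 62),
  aHEIGHT_Perc = c(35, 50, NA, 70)
)

ni <- 2  # number of non-outcome columns preceding the outcomes in `toy`
i  <- 1  # first outcome

column_name <- names(toy)[i + ni]       # look up the outcome name by position
toy$outcome <- toy[[column_name]]       # copy it into a generic `outcome` column
dat_mod <- toy[!is.na(toy$outcome), ]   # keep rows with an observed outcome

print(column_name)
print(nrow(dat_mod))
```

Because every model refers to the generic `outcome` column, the same modelling code can be reused for each outcome by incrementing `i`.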

2.2.1 aMASS_Perc

Data selection and diagnostic plots


column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aBiW))

column_name
## [1] "aMASS_Perc"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
     SEX + 
     GRP +
     aBiW +
     s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               -12.700959  17.260547  -0.736    0.464    
## aBreastFeed_full_stopped  -11.910925  10.726816  -1.110    0.270    
## aBreastFeed_full_duration  -0.598803   1.059977  -0.565    0.573    
## aBreastFeed_total_stopped   6.307452   5.067418   1.245    0.216    
## SEXM                       -2.563793   4.216188  -0.608    0.545    
## GRPVG                      -3.468036   6.590112  -0.526    0.600    
## GRPVN                      -9.035730   5.717512  -1.580    0.117    
## aBiW                        0.022480   0.004126   5.448 3.97e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value   
## s(FAM) 36.66     92 0.756 0.00126 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.484   Deviance explained = 64.6%
## GCV = 631.04  Scale est. = 429.73    n = 140
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1  1.233    0.270
## aBreastFeed_full_duration  1  0.319    0.573
## aBreastFeed_total_stopped  1  1.549    0.216
## SEX                        1  0.370    0.545
## GRP                        2  1.331    0.269
## aBiW                       1 29.685 3.97e-07
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value
## s(FAM) 36.66  92.00 0.756 0.00126

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model


## main model 
mod_main <- run(
   rlme(main = TRUE, exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate   Std. Error    t value
## (Intercept)               -13.60584290 19.171806811 -0.7096797
## GRPVG                      -3.59757121  7.133137456 -0.5043463
## GRPVN                      -9.23123721  6.205851123 -1.4875054
## SEXM                       -2.38696398  4.702374700 -0.5076082
## aBreastFeed_full_stopped  -15.89580812 12.000091190 -1.3246406
## aBreastFeed_full_duration  -0.14178377  1.184237495 -0.1197258
## aBreastFeed_total_stopped   7.00262159  5.641911066  1.2411790
## aBiW                        0.02290339  0.004569268  5.0124847


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of age-standardized percentile of body mass. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of age-standardized percentile of body mass. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate    CI-L        CI-U        P
VN vs OM   -9.231237   -21.39448    2.932008   0.1368814
VG vs OM   -3.597571   -17.57826   10.383121   0.6140181
VN vs VG   -5.633666   -18.20819    6.940856   0.3798854
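
The contrast table can be related back to the coefficient output printed before it. The sketch below recomputes the VN vs OM row from the printed estimate and standard error using a normal (Wald) approximation, and obtains VN vs VG as the difference between the two diet coefficients. That the `emm()` helper works this way is an assumption, since its definition is not shown; the recomputed values agree closely with the table.

```r
# Values copied from summary(mod_main)[['coefficients']] for aMASS_Perc
est_vn <- -9.23123721   # GRPVN: vegans vs omnivores (reference group)
se_vn  <-  6.205851123
est_vg <- -3.59757121   # GRPVG: vegetarians vs omnivores

z  <- qnorm(0.975)                        # two-sided 95% normal quantile
ci <- est_vn + c(-1, 1) * z * se_vn       # Wald confidence interval
p  <- 2 * pnorm(-abs(est_vn / se_vn))     # two-sided p-value from the t value

# VN vs VG is the difference between the two coefficients;
# its standard error would additionally require their covariance.
vn_vs_vg <- est_vn - est_vg

print(round(c(estimate = est_vn, CI_L = ci[1], CI_U = ci[2], P = p), 4))
print(round(vn_vs_vg, 6))
```

The interval for VN vs VG cannot be reconstructed from the printed output alone, because the covariance between the two coefficient estimates is not reported.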

Leave-one-factor mixed models


### main model but non-robust
mod_main <- rlme(exclude = c('log2_age'),
                    include = c('aBiW'))

res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme( exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('log2_age', 'SEX'), include = 'aBiW')

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('log2_age',
                             'aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'),  
                 include = 'aBiW')

### model without diet groups
mod_ndiet <- rlme(exclude = c('log2_age', 'GRP'), include = 'aBiW')

### model without birth weight
mod_other_cov <- rlme(exclude = c('log2_age'))

### model without log2_age
mod_log2_age <- NA
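
The leave-one-factor models above are refits of the main model with one term (or block of terms) removed, so that each term's contribution can be compared, for example via AIC. What `add_AIC()` records and which sign convention it uses are not shown here, so the sketch below is only an illustration of the general idea. It uses ordinary `lm()` on simulated data rather than mixed models, and all variable values are made up.

```r
set.seed(1)
n <- 140

# Simulated data with a birth-weight effect and no true diet or sex effect
d <- data.frame(
  GRP  = factor(sample(c("OM", "VG", "VN"), n, replace = TRUE)),
  SEX  = factor(sample(c("F", "M"), n, replace = TRUE)),
  aBiW = rnorm(n, mean = 3300, sd = 450)
)
d$outcome <- 50 + 0.02 * (d$aBiW - 3300) + rnorm(n, sd = 20)

full <- lm(outcome ~ GRP + SEX + aBiW, data = d)

# Refit without each term in turn; positive values mean that dropping
# the term worsens (increases) the AIC.
drop_terms <- c("GRP", "SEX", "aBiW")
delta_aic <- sapply(drop_terms, function(term) {
  reduced <- update(full, as.formula(paste(". ~ . -", term)))
  AIC(reduced) - AIC(full)
})

print(round(delta_aic, 2))
```

Under this convention, a large positive difference marks a term the model relies on, while values near or below zero suggest the term adds little beyond its parameter cost.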

Putting key results together


## AIC
AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.347
##   Unadjusted ICC: 0.266

i = i+1
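
The intraclass correlation reported above quantifies how strongly the outcome clusters within families: it is the share of variance attributable to the family-level random intercept. If `icc()` here is `performance::icc()` (an assumption based on the printed format), the adjusted ICC considers only random-effect and residual variance, while the unadjusted ICC also includes fixed-effect variance in the denominator. The sketch below illustrates the basic definition on simulated, balanced data with a one-way ANOVA estimator; it is not the estimator used in the report, and the real families are not balanced.

```r
# ICC as the share of total variance due to family-level differences
icc_from_variances <- function(var_family, var_residual) {
  var_family / (var_family + var_residual)
}

set.seed(42)
n_fam  <- 200   # simulated families (made-up)
n_kids <- 3     # children per family (balanced, for simplicity)

fam <- factor(rep(seq_len(n_fam), each = n_kids))
u   <- rnorm(n_fam, sd = sqrt(0.35))                  # family effects
y   <- u[as.integer(fam)] + rnorm(n_fam * n_kids, sd = sqrt(0.65))

# Method-of-moments estimate from between- and within-family mean squares
ms <- anova(lm(y ~ fam))[["Mean Sq"]]
var_fam_hat <- (ms[1] - ms[2]) / n_kids
icc_hat <- icc_from_variances(var_fam_hat, ms[2])

print(round(icc_hat, 3))
```

The simulation uses a true ICC of 0.35, so the estimate should land in that neighbourhood, subject to sampling noise.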

2.2.2 aHEIGHT_Perc

Data selection and diagnostic plots


## Data selection and diagnostic plots
ni  <-  10

column_name <- names(dat_child_all)[i+ni]
column_name
## [1] "aHEIGHT_Perc"

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aBiW))

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aBiW +
    #s(log2_age)
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                -3.81188   16.73962  -0.228    0.820    
## aBreastFeed_full_stopped  -11.41481   10.26311  -1.112    0.269    
## aBreastFeed_full_duration   0.23064    1.01420   0.227    0.821    
## aBreastFeed_total_stopped   0.95070    4.89853   0.194    0.847    
## SEXM                       -0.34825    4.05298  -0.086    0.932    
## GRPVG                       0.02824    6.72575   0.004    0.997    
## GRPVN                      -8.70175    5.81139  -1.497    0.138    
## aBiW                        0.01854    0.00402   4.611  1.4e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F  p-value    
## s(FAM) 47.21     92 1.208 4.97e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.543   Deviance explained = 72.1%
## GCV =  576.9  Scale est. = 349.4     n = 140
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F p-value
## aBreastFeed_full_stopped   1  1.237   0.269
## aBreastFeed_full_duration  1  0.052   0.821
## aBreastFeed_total_stopped  1  0.038   0.847
## SEX                        1  0.007   0.932
## GRP                        2  1.617   0.205
## aBiW                       1 21.264 1.4e-05
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F  p-value
## s(FAM) 47.21  92.00 1.208 4.97e-05

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model


## main model 
mod_main <- run(
   rlme(main = TRUE, exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate   Std. Error      t value
## (Intercept)               -4.990866084 18.016806377 -0.277011696
## GRPVG                      0.858816421  7.718147703  0.111272348
## GRPVN                     -7.969334744  6.647194041 -1.198902077
## SEXM                      -0.163256291  4.309701831 -0.037881110
## aBreastFeed_full_stopped  -9.085867623 10.871090074 -0.835782572
## aBreastFeed_full_duration -0.008276443  1.070600889 -0.007730652
## aBreastFeed_total_stopped  1.170957605  5.259674358  0.222629297
## aBiW                       0.018461899  0.004343767  4.250204501


res <- emm(mod_main)
  
suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects regression modelling of age-standardized percentile of body height. CI_L and CI_U are bounds of 95% confidence interval. Estimates are based on bootstrap (5000 simulations)') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1.5in')


suppl_table
Results of linear mixed-effects regression modelling of age-standardized percentile of body height. CI_L and CI_U are bounds of 95% confidence interval. Estimates are based on bootstrap (5000 simulations)
Contrast   Estimate     CI-L        CI-U        P
VN vs OM   -7.9693347   -20.99760    5.058926   0.2305660
VG vs OM    0.8588164   -14.26848   15.986108   0.9114004
VN vs VG   -8.8281512   -22.23353    4.577226   0.1967938
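
Each main model in this report is fitted inside `run(..., path = ..., reuse = TRUE)`, which appears to store the fitted object on disk and reload it on later runs instead of refitting. The actual `run()` helper is not shown, so the function below is a hypothetical sketch of such a caching wrapper; its name, the `.rds` file extension, and its behaviour are assumptions.

```r
# Hypothetical caching wrapper; NOT the authors' run() helper.
run_cached <- function(expr, path, reuse = TRUE) {
  file <- paste0(path, ".rds")
  if (reuse && file.exists(file)) {
    return(readRDS(file))          # reuse the stored fit; `expr` is never evaluated
  }
  result <- expr                   # forces the lazily evaluated model call
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, file)
  result
}

# Demo with a cheap model from the built-in `cars` data
cache_path <- file.path(tempdir(), "demo_cache", "toy_model")
unlink(paste0(cache_path, ".rds"))  # start from an empty cache

n_fits <- 0
fit_toy <- function() {
  n_fits <<- n_fits + 1            # count how often the model is actually fitted
  lm(dist ~ speed, data = cars)
}

m1 <- run_cached(fit_toy(), cache_path)  # fits the model and stores it
m2 <- run_cached(fit_toy(), cache_path)  # read from disk, no refit
print(n_fits)
```

Because R evaluates function arguments lazily, the model call passed as `expr` is only executed when no cached file is found. A cache of this kind must be cleared manually whenever the data or model specification changes.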

Leave-one-factor mixed models


### main model but non-robust
mod_main <- rlme(exclude = c('log2_age'),
                    include = c('aBiW'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme( exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('log2_age', 'SEX'), include = 'aBiW')

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('log2_age',
                             'aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'),  
                 include = 'aBiW')

### model without diet groups
mod_ndiet <- rlme(exclude = c('log2_age', 'GRP'), include = 'aBiW')

### model without birth weight
mod_other_cov <- rlme(exclude = c('log2_age'))


### model without log2_age
mod_log2_age <- NA
  # rlme(exclude = c('log2_age'), include = 'aBiW')

Putting key results together


## AIC
AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.459
##   Unadjusted ICC: 0.374

i = i+1

2.2.3 aBMI_PERC

Data selection and diagnostic plots


column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aBiW))

column_name
## [1] "aBMI_PERC"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aBiW +
    #s(log2_age)
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               12.581404  16.470294   0.764 0.446809    
## aBreastFeed_full_stopped   2.269435  10.243692   0.222 0.825139    
## aBreastFeed_full_duration -1.363984   1.012137  -1.348 0.180950    
## aBreastFeed_total_stopped  2.703043   4.836547   0.559 0.577546    
## SEXM                      -4.714809   4.025037  -1.171 0.244349    
## GRPVG                     -2.937324   6.270471  -0.468 0.640534    
## GRPVN                     -2.506189   5.441721  -0.461 0.646162    
## aBiW                       0.013496   0.003936   3.429 0.000895 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##        edf Ref.df     F p-value   
## s(FAM)  36     92 0.699 0.00299 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.365   Deviance explained = 56.2%
## GCV = 575.48  Scale est. = 394.62    n = 140
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1  0.049 0.825139
## aBreastFeed_full_duration  1  1.816 0.180950
## aBreastFeed_total_stopped  1  0.312 0.577546
## SEX                        1  1.372 0.244349
## GRP                        2  0.139 0.869990
## aBiW                       1 11.757 0.000895
## 
## Approximate significance of smooth terms:
##        edf Ref.df     F p-value
## s(FAM)  36     92 0.699 0.00299

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model


## main model 
mod_main <- run(
   rlme(main = TRUE, exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                            Estimate   Std. Error    t value
## (Intercept)                8.421909 17.170011614  0.4905011
## GRPVG                     -2.790478  7.092862170 -0.3934206
## GRPVN                     -2.318842  6.118688518 -0.3789769
## SEXM                      -4.369637  4.135927817 -1.0565072
## aBreastFeed_full_stopped   8.083229 10.452991829  0.7732933
## aBreastFeed_full_duration -1.831574  1.031813714 -1.7751019
## aBreastFeed_total_stopped  1.722133  5.018471399  0.3431590
## aBiW                       0.013933  0.004131799  3.3721383


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of age-standardized percentile of BMI. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of age-standardized percentile of BMI. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate    CI-L        CI-U        P
VN vs OM   -2.318842   -14.31125    9.673567   0.7047050
VG vs OM   -2.790478   -16.69223   11.111277   0.6940089
VN vs VG    0.471636   -11.87847   12.821746   0.9403351

Leave-one-factor mixed models


### main model but non-robust
mod_main <- rlme(exclude = c('log2_age'),
                    include = c('aBiW'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme( exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('log2_age', 'SEX'), include = 'aBiW')

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('log2_age',
                             'aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'),  
                 include = 'aBiW')

### model without diet groups
mod_ndiet <- rlme(exclude = c('log2_age', 'GRP'), include = 'aBiW')

### model without birth weight
mod_other_cov <- rlme(exclude = c('log2_age'))

### model without log2_age
mod_log2_age <- NA

Putting key results together


## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.318
##   Unadjusted ICC: 0.284

i = i+1

2.2.4 aM_per_H_PERC

Data selection and diagnostic plots


column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aBiW))

column_name
## [1] "aM_per_H_PERC"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aBiW +
    #s(log2_age)
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                4.77055   16.52976   0.289 0.773472    
## aBreastFeed_full_stopped  11.29903   10.34650   1.092 0.277387    
## aBreastFeed_full_duration -1.43693    1.02105  -1.407 0.162386    
## aBreastFeed_total_stopped  1.77547    4.86443   0.365 0.715877    
## SEXM                      -3.38832    4.05437  -0.836 0.405272    
## GRPVG                     -3.39585    6.14985  -0.552 0.582034    
## GRPVN                     -3.04783    5.35041  -0.570 0.570176    
## aBiW                       0.01345    0.00394   3.413 0.000922 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value  
## s(FAM) 30.32     92 0.531  0.0109 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.308   Deviance explained = 49.4%
## GCV = 586.82  Scale est. = 426.18    n = 140
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aBiW + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1  1.193 0.277387
## aBreastFeed_full_duration  1  1.981 0.162386
## aBreastFeed_total_stopped  1  0.133 0.715877
## SEX                        1  0.698 0.405272
## GRP                        2  0.204 0.815929
## aBiW                       1 11.652 0.000922
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value
## s(FAM) 30.32  92.00 0.531  0.0109

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model


## main model 
mod_main <- run(
   rlme(main = TRUE, exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate   Std. Error     t value
## (Intercept)                0.42752142 17.856789359  0.02394167
## GRPVG                     -3.66067251  6.472091557 -0.56560889
## GRPVN                     -3.16988489  5.650048603 -0.56103675
## SEXM                      -3.18334185  4.396888867 -0.72399871
## aBreastFeed_full_stopped  12.32997123 11.260511790  1.09497432
## aBreastFeed_full_duration -1.28144002  1.108615210 -1.15589251
## aBreastFeed_total_stopped  1.79114874  5.270951121  0.33981509
## aBiW                       0.01421465  0.004241191  3.35157052


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of age-standardized percentile of weight to height ratio. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of age-standardized percentile of weight to height ratio. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L        CI-U        P
VN vs OM   -3.1698849   -14.24378    7.904007   0.5747725
VG vs OM   -3.6606725   -16.34574    9.024394   0.5716597
VN vs VG    0.4907876   -10.97281   11.954388   0.9331269

Leave-one-factor mixed models


### main model but non-robust
mod_main <- rlme(exclude = c('log2_age'),
                    include = c('aBiW'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme( exclude = c('log2_age'),
                    include = c('aBiW'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('log2_age', 'SEX'), include = 'aBiW')

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('log2_age',
                             'aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'),  
                 include = 'aBiW')

### model without diet groups
mod_ndiet <- rlme(exclude = c('log2_age', 'GRP'), include = 'aBiW')

### model without birth weight
mod_other_cov <- rlme(exclude = c('log2_age'))

### model without log2_age
mod_log2_age <- NA

Putting key results together


## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.259
##   Unadjusted ICC: 0.233

i = i+1

2.2.5 aGLY

Data selection and diagnostic plots


column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aGLY"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                4.53487    0.30614  14.813   <2e-16 ***
## aBreastFeed_full_stopped   0.06490    0.31825   0.204    0.839    
## aBreastFeed_full_duration -0.02608    0.02613  -0.998    0.320    
## aBreastFeed_total_stopped -0.02682    0.15763  -0.170    0.865    
## SEXM                       0.06084    0.10517   0.579    0.564    
## GRPVG                      0.01676    0.15147   0.111    0.912    
## GRPVN                     -0.03333    0.13080  -0.255    0.799    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F p-value
## s(log2_age)  1.0      1 0.030   0.864
## s(FAM)      12.9     87 0.182   0.159
## 
## R-sq.(adj) =  0.0775   Deviance explained = 21.8%
## GCV = 0.37835  Scale est. = 0.31845   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.042   0.839
## aBreastFeed_full_duration  1 0.996   0.320
## aBreastFeed_total_stopped  1 0.029   0.865
## SEX                        1 0.335   0.564
## GRP                        2 0.070   0.933
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F p-value
## s(log2_age)  1.0    1.0 0.030   0.864
## s(FAM)      12.9   87.0 0.182   0.159
plot(gamm, select = 1)


### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome


gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                2.168629   0.093955  23.082   <2e-16 ***
## aBreastFeed_full_stopped   0.018032   0.097713   0.185    0.854    
## aBreastFeed_full_duration -0.008025   0.008018  -1.001    0.319    
## aBreastFeed_total_stopped -0.014262   0.048375  -0.295    0.769    
## SEXM                       0.026719   0.032279   0.828    0.410    
## GRPVG                      0.006062   0.046325   0.131    0.896    
## GRPVN                     -0.006405   0.040026  -0.160    0.873    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00      1 0.053   0.818
## s(FAM)      11.97     87 0.165   0.185
## 
## R-sq.(adj) =  0.0718   Deviance explained = 20.6%
## GCV = 0.035654  Scale est. = 0.030261  n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.034   0.854
## aBreastFeed_full_duration  1 1.002   0.319
## aBreastFeed_total_stopped  1 0.087   0.769
## SEX                        1 0.685   0.410
## GRP                        2 0.042   0.959
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 0.053   0.818
## s(FAM)      11.97  87.00 0.165   0.185
plot(gamm, select = 1)


### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The log2 transformation did not substantially improve the fit, so we continue with the untransformed values.
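
For readers comparing the two fits: coefficients from the log2(outcome) model are on a multiplicative scale, so a coefficient b corresponds to a ratio of 2^b in the outcome (strictly, in its geometric mean). The short sketch below back-transforms the GRPVN coefficient printed in the log2 model summary; the variable names are introduced here for illustration only.

```r
# GRPVN coefficient copied from the log2(outcome) GAM summary
beta_vn_log2 <- -0.006405

ratio      <- 2^beta_vn_log2        # multiplicative difference, VN relative to OM
pct_change <- 100 * (ratio - 1)     # the same difference as a percentage

print(round(c(ratio = ratio, pct_change = pct_change), 3))
```

A ratio this close to 1 is consistent with the original-scale model, where the corresponding diet differences in glycemia are also small relative to their standard errors.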

Fit main model


## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                4.503637802 0.21008595 21.4371204
## GRPVG                      0.051518564 0.11291295  0.4562680
## GRPVN                      0.009939696 0.09840296  0.1010101
## SEXM                       0.083487939 0.08214344  1.0163677
## aBreastFeed_full_stopped   0.041847382 0.24976969  0.1675439
## aBreastFeed_full_duration -0.017282491 0.02036016 -0.8488388
## aBreastFeed_total_stopped -0.089759485 0.12301474 -0.7296645
## log2_age                  -0.022650928 0.05373160 -0.4215570


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of glycemia. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of glycemia. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate     CI-L         CI-U        P
VN vs OM    0.0099397   -0.1829266    0.2028060   0.9195424
VG vs OM    0.0515186   -0.1697868    0.2728239   0.6481973
VN vs VG   -0.0415789   -0.2528550    0.1696973   0.6997053

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()

res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))
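
The leave-one-factor models above are later compared by AIC. The helpers `rlme()` and `add_AIC()` are defined elsewhere in the project and are not shown in this section, so the following is only a minimal sketch of the general idea, under stated assumptions: it uses simulated data (hypothetical values), plain `lm()` instead of a mixed model, and reports the AIC difference between each reduced model and the full model.

```r
# Illustrative sketch only: simulated data and ordinary linear models,
# not the project's rlme()/add_AIC() helpers.
set.seed(1)
n <- 120
dat <- data.frame(
  GRP = factor(sample(c("OM", "VG", "VN"), n, replace = TRUE)),
  SEX = factor(sample(c("F", "M"), n, replace = TRUE)),
  log2_age = rnorm(n, mean = 2, sd = 1)
)
# Hypothetical outcome with a group effect for VN only
dat$outcome <- 4 - 0.3 * (dat$GRP == "VN") + rnorm(n, sd = 0.5)

# Full model with all terms
full <- lm(outcome ~ GRP + SEX + log2_age, data = dat)

# Refit the model with one term removed at a time and compare AIC
terms_to_drop <- c("GRP", "SEX", "log2_age")
delta_aic <- sapply(terms_to_drop, function(term) {
  reduced <- update(full, as.formula(paste(". ~ . -", term)))
  AIC(reduced) - AIC(full)
})

delta_aic
```

A positive difference means that the model fits worse (after the AIC penalty) once the term is removed, which suggests that the term carries predictive information.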

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.105
##   Unadjusted ICC: 0.103

i = i+1
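
For orientation, the ICC reported above can be read as follows. This is a sketch under assumptions: the section does not show which package supplies `icc()`, and the formulas below follow the usual definitions of the adjusted and unadjusted ICC for a random-intercept model (as used, for example, in the `performance` package). The symbols are introduced here for illustration and do not appear in the original report.

```latex
% Random-intercept model for child i in family j (assumed form)
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)

% Adjusted ICC: family-level variance relative to random + residual variance
\mathrm{ICC}_{\mathrm{adj}} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}

% Unadjusted ICC: the variance of the fixed-effect predictions, \sigma_f^2,
% is added to the denominator
\mathrm{ICC}_{\mathrm{unadj}} = \frac{\sigma_u^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2}
```

Under these definitions the unadjusted ICC cannot exceed the adjusted ICC, which agrees with the values printed in this report.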

2.2.6 aTC

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aTC"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                3.86113    0.36676  10.528   <2e-16 ***
## aBreastFeed_full_stopped   0.21867    0.37496   0.583   0.5612    
## aBreastFeed_full_duration  0.01796    0.02778   0.647   0.5195    
## aBreastFeed_total_stopped -0.25101    0.17094  -1.468   0.1454    
## SEXM                      -0.05079    0.11245  -0.452   0.6525    
## GRPVG                     -0.19030    0.17308  -1.100   0.2744    
## GRPVN                     -0.31981    0.14833  -2.156   0.0337 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value  
## s(log2_age)  2.07   2.54 1.414  0.2166  
## s(FAM)      30.15  87.00 0.566  0.0108 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.313   Deviance explained = 51.3%
## GCV = 0.42491  Scale est. = 0.29867   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.340   0.561
## aBreastFeed_full_duration  1 0.418   0.519
## aBreastFeed_total_stopped  1 2.156   0.145
## SEX                        1 0.204   0.653
## GRP                        2 2.324   0.104
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  2.07   2.54 1.414  0.2166
## s(FAM)      30.15  87.00 0.566  0.0108
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error      t value
## (Intercept)                4.003671507 0.29977251 13.355699189
## GRPVG                     -0.182702640 0.17935394 -1.018670892
## GRPVN                     -0.320313125 0.15363806 -2.084855272
## SEXM                      -0.026242596 0.11688578 -0.224514872
## aBreastFeed_full_stopped  -0.002126509 0.35168225 -0.006046676
## aBreastFeed_full_duration  0.021438364 0.02910158  0.736673645
## aBreastFeed_total_stopped -0.283445212 0.17608030 -1.609749706
## log2_age                   0.035909877 0.07716968  0.465336582


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of total cholesterol. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of total cholesterol. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate     CI-L         CI-U        P
VN vs OM   -0.3203131   -0.6214382   -0.0191881   0.0370824
VG vs OM   -0.1827026   -0.5342299    0.1688246   0.3083592
VN vs VG   -0.1376105   -0.4642840    0.1890631   0.4090138
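
The interval and P value for VN vs OM appear consistent with a normal-approximation (Wald) calculation from the estimate and standard error of the `GRPVN` coefficient reported above. The helper `emm()` is defined elsewhere in the project and its internals are not shown in this section, so the sketch below is only a check on the printed numbers, not a description of that function.

```r
# Values copied from the coefficient table above (GRPVN row, total cholesterol)
est <- -0.320313125
se  <-  0.15363806

# Wald statistic, 95% interval and two-sided P value (normal approximation)
z  <- est / se
ci <- est + c(-1, 1) * qnorm(0.975) * se
p  <- 2 * pnorm(-abs(z))

round(c(Estimate = est, CI_L = ci[1], CI_U = ci[2], P = p), 4)
```

The result can be compared directly with the VN vs OM row of the table.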

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.273
##   Unadjusted ICC: 0.256

i = i+1

2.2.7 aHDL

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aHDL"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.2514911  0.1736247   7.208 3.35e-10 ***
## aBreastFeed_full_stopped   0.0784785  0.1753776   0.447   0.6558    
## aBreastFeed_full_duration  0.0006637  0.0131937   0.050   0.9600    
## aBreastFeed_total_stopped -0.1263664  0.0823466  -1.535   0.1290    
## SEXM                       0.0930663  0.0534365   1.742   0.0856 .  
## GRPVG                     -0.0716227  0.0905130  -0.791   0.4312    
## GRPVN                     -0.0147399  0.0771207  -0.191   0.8489    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  1.907  2.307 4.604 0.012109 *  
## s(FAM)      46.117 87.000 1.217 0.000132 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =    0.5   Deviance explained = 70.6%
## GCV = 0.095377  Scale est. = 0.055619  n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.200  0.6558
## aBreastFeed_full_duration  1 0.003  0.9600
## aBreastFeed_total_stopped  1 2.355  0.1290
## SEX                        1 3.033  0.0856
## GRP                        2 0.347  0.7082
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  1.907  2.307 4.604 0.012109
## s(FAM)      46.117 87.000 1.217 0.000132
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                0.235071   0.200055   1.175   0.2437  
## aBreastFeed_full_stopped   0.123274   0.202090   0.610   0.5437  
## aBreastFeed_full_duration  0.000145   0.014669   0.010   0.9921  
## aBreastFeed_total_stopped -0.149691   0.092340  -1.621   0.1092  
## SEXM                       0.103830   0.059581   1.743   0.0855 .
## GRPVG                     -0.054645   0.102429  -0.533   0.5953  
## GRPVN                      0.006416   0.087248   0.074   0.9416  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value    
## s(log2_age)  2.26  2.739 4.814  0.00724 ** 
## s(FAM)      48.35 87.000 1.388 3.86e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.536   Deviance explained = 73.7%
## GCV = 0.11818  Scale est. = 0.066604  n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.372  0.5437
## aBreastFeed_full_duration  1 0.000  0.9921
## aBreastFeed_total_stopped  1 2.628  0.1092
## SEX                        1 3.037  0.0855
## GRP                        2 0.228  0.7970
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.260  2.739 4.814  0.00724
## s(FAM)      48.348 87.000 1.388 3.86e-05
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The diagnostic plots look somewhat better after log2 transformation, but the difference is small. We will use the original values for consistency with the LDL analysis, where transformation is unsuitable.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)

summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                1.221828302 0.13135304  9.3018652
## GRPVG                     -0.062871348 0.09315387 -0.6749193
## GRPVN                     -0.011681354 0.07916009 -0.1475662
## SEXM                       0.103468531 0.05047009  2.0500962
## aBreastFeed_full_stopped  -0.062432570 0.14952516 -0.4175389
## aBreastFeed_full_duration  0.001456792 0.01255316  0.1160498
## aBreastFeed_total_stopped -0.134853396 0.07839031 -1.7202815
## log2_age                   0.100586629 0.03344898  3.0071657


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of HDL cholesterol level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of HDL cholesterol level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate     CI-L         CI-U        P
VN vs OM   -0.0116814   -0.1668323    0.1434696   0.8826851
VG vs OM   -0.0628713   -0.2454496    0.1197069   0.4997270
VN vs VG    0.0511900   -0.1146963    0.2170763   0.5453018

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)


### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.466
##   Unadjusted ICC: 0.429

i = i+1

2.2.8 aLDL

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aLDL"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.92602    0.31511   6.112 3.04e-08 ***
## aBreastFeed_full_stopped   0.39351    0.32049   1.228   0.2230    
## aBreastFeed_full_duration  0.01220    0.02351   0.519   0.6051    
## aBreastFeed_total_stopped -0.20773    0.14604  -1.422   0.1586    
## SEXM                      -0.07770    0.09530  -0.815   0.4172    
## GRPVG                     -0.01018    0.15442  -0.066   0.9476    
## GRPVN                     -0.30283    0.13186  -2.297   0.0242 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  2.17  2.643 0.754 0.51430   
## s(FAM)      39.64 87.000 0.859 0.00206 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.404   Deviance explained = 62.1%
## GCV = 0.30373  Scale est. = 0.19143   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.508   0.223
## aBreastFeed_full_duration  1 0.269   0.605
## aBreastFeed_total_stopped  1 2.023   0.159
## SEX                        1 0.665   0.417
## GRP                        2 3.491   0.035
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.170  2.643 0.754 0.51430
## s(FAM)      39.637 87.000 0.859 0.00206
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.082642   0.299259   3.618 0.000597 ***
## aBreastFeed_full_stopped   0.203347   0.304221   0.668 0.506334    
## aBreastFeed_full_duration  0.001904   0.019061   0.100 0.920751    
## aBreastFeed_total_stopped -0.245693   0.124691  -1.970 0.053239 .  
## SEXM                      -0.071477   0.078326  -0.913 0.364989    
## GRPVG                     -0.036316   0.147856  -0.246 0.806786    
## GRPVN                     -0.223078   0.125665  -1.775 0.080753 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  4.019  4.773 1.768    0.103    
## s(FAM)      58.670 87.000 1.964 8.11e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.595   Deviance explained = 80.8%
## GCV = 0.20303  Scale est. = 0.09584   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.447  0.5063
## aBreastFeed_full_duration  1 0.010  0.9208
## aBreastFeed_total_stopped  1 3.883  0.0532
## SEX                        1 0.833  0.3650
## GRP                        2 1.908  0.1570
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  4.019  4.773 1.768    0.103
## s(FAM)      58.670 87.000 1.964 8.11e-06
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation did not improve the fit substantially. We will continue to work with original values.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                1.911899592 0.24645384  7.7576375
## GRPVG                      0.078278763 0.13245927  0.5909648
## GRPVN                     -0.252304231 0.11543746 -2.1856357
## SEXM                      -0.046105284 0.09636326 -0.4784529
## aBreastFeed_full_stopped   0.369156136 0.29300722  1.2598875
## aBreastFeed_full_duration  0.007058201 0.02388470  0.2955114
## aBreastFeed_total_stopped -0.117545760 0.14430977 -0.8145378
## log2_age                  -0.038489757 0.06303305 -0.6106282


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of LDL cholesterol level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of LDL cholesterol level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate     CI-L         CI-U        P
VN vs OM   -0.2523042   -0.4785575   -0.0260510   0.0288423
VG vs OM    0.0782788   -0.1813366    0.3378942   0.5545440
VN vs VG   -0.3305830   -0.5784331   -0.0827329   0.0089436

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.322
##   Unadjusted ICC: 0.293

i = i+1

2.2.9 aTG

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aTG"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.6513536  0.2834306   5.826 8.96e-08 ***
## aBreastFeed_full_stopped  -0.5959491  0.2909482  -2.048   0.0435 *  
## aBreastFeed_full_duration  0.0008207  0.0241676   0.034   0.9730    
## aBreastFeed_total_stopped  0.1657157  0.1468551   1.128   0.2622    
## SEXM                      -0.1516153  0.0970188  -1.563   0.1217    
## GRPVG                     -0.2480591  0.1546691  -1.604   0.1123    
## GRPVN                     -0.0277352  0.1320841  -0.210   0.8342    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 4.257 0.04201 * 
## s(FAM)      35.26     87 0.754 0.00222 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.409   Deviance explained =   60%
## GCV = 0.31937  Scale est. = 0.2147    n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 4.196  0.0435
## aBreastFeed_full_duration  1 0.001  0.9730
## aBreastFeed_total_stopped  1 1.273  0.2622
## SEX                        1 2.442  0.1217
## GRP                        2 1.542  0.2197
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 4.257 0.04201
## s(FAM)      35.26  87.00 0.754 0.00222
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                0.369053   0.329267   1.121   0.2654  
## aBreastFeed_full_stopped  -0.357689   0.337751  -1.059   0.2925  
## aBreastFeed_full_duration -0.013942   0.028070  -0.497   0.6206  
## aBreastFeed_total_stopped  0.176375   0.170698   1.033   0.3043  
## SEXM                      -0.195188   0.112680  -1.732   0.0867 .
## GRPVG                     -0.310227   0.180657  -1.717   0.0895 .
## GRPVN                     -0.001684   0.154220  -0.011   0.9913  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value    
## s(log2_age)  1.00      1 4.595 0.034826 *  
## s(FAM)      36.28     87 0.824 0.000942 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.426   Deviance explained = 61.6%
## GCV = 0.43062  Scale est. = 0.28616   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.122  0.2925
## aBreastFeed_full_duration  1 0.247  0.6206
## aBreastFeed_total_stopped  1 1.068  0.3043
## SEX                        1 3.001  0.0867
## GRP                        2 1.991  0.1427
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value
## s(log2_age)  1.00   1.00 4.595 0.034826
## s(FAM)      36.28  87.00 0.824 0.000942
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome)
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                0.693242766 0.28059330  2.4706320
## GRPVG                     -0.306006754 0.17961065 -1.7037227
## GRPVN                     -0.008325549 0.15309046 -0.0543832
## SEXM                      -0.201858851 0.10897611 -1.8523220
## aBreastFeed_full_stopped  -0.425854562 0.32588729 -1.3067541
## aBreastFeed_full_duration -0.018184420 0.02714738 -0.6698406
## aBreastFeed_total_stopped  0.145466145 0.16578495  0.8774388
## log2_age                  -0.151948674 0.07211229 -2.1071120


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2-transformed triglyceride (TG) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2-transformed triglyceride (TG) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate     CI-L         CI-U        P
VN vs OM   -0.0083255   -0.3083773    0.2917262   0.9566299
VG vs OM   -0.3060068   -0.6580372    0.0460237   0.0884329
VN vs VG    0.2976812   -0.0258080    0.6211704   0.0712944
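
Because the outcome was log2-transformed before fitting, the estimates in this table are differences on the log2 scale. A minimal sketch of how they could be back-transformed to ratios is given below. The numbers are copied from the table above; the object and column names (`res_log2`, `ratio`, `pct_change`, and so on) are introduced here for illustration and are not part of the original analysis.

```r
# Estimates and 95% CI bounds on the log2 scale, copied from the table above
res_log2 <- data.frame(
  contrast = c("VN vs OM", "VG vs OM", "VN vs VG"),
  estimate = c(-0.0083255, -0.3060068,  0.2976812),
  ci_l     = c(-0.3083773, -0.6580372, -0.0258080),
  ci_u     = c( 0.2917262,  0.0460237,  0.6211704)
)

# A difference d on the log2 scale corresponds to a ratio of 2^d
res_ratio <- transform(
  res_log2,
  ratio   = 2^estimate,
  ratio_l = 2^ci_l,
  ratio_u = 2^ci_u
)

# Percent change relative to the reference group of each contrast
res_ratio$pct_change <- 100 * (res_ratio$ratio - 1)

print(res_ratio[, c("contrast", "ratio", "ratio_l", "ratio_u", "pct_change")],
      digits = 3)
```

Read this way, the VG vs OM estimate of about -0.31 corresponds to a ratio of roughly 0.81, that is, TG values about 19% lower in VG than in OM, with an interval that includes a ratio of 1.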

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.378
##   Unadjusted ICC: 0.323

i = i+1

2.2.10 aCa

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aCa"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                2.624678   0.047059  55.774   <2e-16 ***
## aBreastFeed_full_stopped  -0.014348   0.047525  -0.302    0.764    
## aBreastFeed_full_duration -0.002326   0.003872  -0.601    0.550    
## aBreastFeed_total_stopped -0.015607   0.023909  -0.653    0.516    
## SEXM                      -0.021236   0.015535  -1.367    0.176    
## GRPVG                      0.009508   0.027247   0.349    0.728    
## GRPVN                     -0.010115   0.023057  -0.439    0.662    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(log2_age)  1.20  1.327 21.853 3.79e-06 ***
## s(FAM)      49.97 88.000  1.312 0.000134 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.657   Deviance explained = 80.6%
## GCV = 0.0081933  Scale est. = 0.0046099  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.091   0.764
## aBreastFeed_full_duration  1 0.361   0.550
## aBreastFeed_total_stopped  1 0.426   0.516
## SEX                        1 1.869   0.176
## GRP                        2 0.327   0.722
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F  p-value
## s(log2_age)  1.200  1.327 21.853 3.79e-06
## s(FAM)      49.968 88.000  1.312 0.000134
plot(gamm, select = 1)
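
One way to read the `s(log2_age)` rows above: the effective degrees of freedom (edf) of a smooth term indicate how far the fitted curve departs from a straight line, and a value near 1 is compatible with an approximately linear age effect. The sketch below is a minimal illustration on simulated data, not the study data; the object names `sim`, `fit`, `fam_effect` and all simulated values are assumptions made for the example.

```r
# Simulated data only; variable names mirror the report but values are made up.
library(mgcv)

set.seed(42)
n_fam <- 40                                   # hypothetical number of families
FAM <- factor(rep(seq_len(n_fam), each = 3))  # three children per family
log2_age <- runif(length(FAM), min = 0, max = 4)
fam_effect <- rnorm(n_fam, sd = 0.5)[as.integer(FAM)]
outcome <- 2 - 0.1 * log2_age + fam_effect + rnorm(length(FAM), sd = 0.3)
sim <- data.frame(outcome, log2_age, FAM)

# Same structure as the diagnostic fit: smooth age term plus family random effect
fit <- gam(outcome ~ s(log2_age) + s(FAM, bs = "re"), data = sim)

# Effective degrees of freedom per smooth term; a value near 1 for s(log2_age)
# is compatible with a straight-line age effect
edf <- summary(fit)$s.table[, "edf"]
print(round(edf, 2))
```

If the edf were clearly above 1, a linear `log2_age` term in the mixed model might be too restrictive, and the smooth plot would be worth inspecting more closely.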

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, 
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate  Std. Error    t value
## (Intercept)                2.713934960 0.041768239 64.9760452
## GRPVG                      0.007421540 0.028911488  0.2566986
## GRPVN                     -0.016722252 0.024445181 -0.6840715
## SEXM                      -0.019231319 0.016029566 -1.1997404
## aBreastFeed_full_stopped  -0.019544360 0.047797391 -0.4089001
## aBreastFeed_full_duration -0.002265238 0.004002109 -0.5660112
## aBreastFeed_total_stopped -0.025071669 0.024749264 -1.0130269
## log2_age                  -0.054295935 0.010653890 -5.0963485


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum Ca level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum Ca level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate       CI-L       CI-U          P
VN vs OM  -0.0167223 -0.0646339  0.0311894  0.4939300
VG vs OM   0.0074215 -0.0492439  0.0640870  0.7974114
VN vs VG  -0.0241438 -0.0756253  0.0273377  0.3579991
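
The contrast table can be related back to the coefficient table printed just before it. The sketch below assumes that the intervals and P-values come from a Wald normal approximation (estimate ± z × standard error); the report does not state how `emm()` computes them, so this is a plausibility check rather than a description of that function. The names `est`, `se`, `z_crit` and `vn_vs_vg` are introduced here for illustration.

```r
# Values copied from the coefficient table of the robust main model for aCa
est <- -0.016722252   # GRPVN estimate (VN vs OM)
se  <-  0.024445181   # its standard error

# Wald interval and two-sided P-value under a normal approximation (assumption)
z_crit <- qnorm(0.975)
ci <- est + c(-1, 1) * z_crit * se
p  <- 2 * pnorm(-abs(est / se))

# VN vs VG is the difference of the two diet coefficients (GRPVN minus GRPVG)
vn_vs_vg <- -0.016722252 - 0.007421540

print(round(c(CI_L = ci[1], CI_U = ci[2], P = p, VN_vs_VG = vn_vs_vg), 4))
```

Under this assumption the computed bounds should agree with the VN vs OM row of the table, and the coefficient difference with the VN vs VG estimate.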

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(
        exclude = c('aBreastFeed_full_stopped',
                    'aBreastFeed_total_stopped',
                    'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE); diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.461
##   Unadjusted ICC: 0.291
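
For orientation, the two ICC values can be read as variance ratios: the adjusted ICC relates the family-level variance to the sum of family-level and residual variance, while the unadjusted ICC also counts the variance explained by the fixed effects in the denominator. The sketch below uses made-up variance components; the function name `icc_from_variances` and all numbers are hypothetical and are not taken from the fitted models.

```r
# Hypothetical helper: ICC as the share of variance attributable to family
icc_from_variances <- function(var_family, var_residual, var_fixed = 0) {
  var_family / (var_family + var_residual + var_fixed)
}

# Made-up variance components, for illustration only
var_family   <- 0.4   # between-family (random intercept) variance
var_residual <- 0.6   # within-family residual variance
var_fixed    <- 0.5   # variance of the fixed-effects predictions

# Adjusted ICC ignores the fixed-effects variance
icc_adjusted   <- icc_from_variances(var_family, var_residual)
# Unadjusted ICC includes it, so it can never exceed the adjusted ICC
icc_unadjusted <- icc_from_variances(var_family, var_residual, var_fixed)

print(round(c(adjusted = icc_adjusted, unadjusted = icc_unadjusted), 3))
```

This is why the unadjusted ICC in the output is lower than the adjusted one whenever the fixed effects (here notably age) explain part of the outcome variance.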

i = i+1

2.2.11 aP

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aP"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.595472   0.073719  21.643   <2e-16 ***
## aBreastFeed_full_stopped   0.057788   0.075698   0.763    0.447    
## aBreastFeed_full_duration -0.002964   0.006282  -0.472    0.638    
## aBreastFeed_total_stopped  0.060870   0.038009   1.601    0.113    
## SEXM                      -0.004426   0.025120  -0.176    0.861    
## GRPVG                     -0.005682   0.040220  -0.141    0.888    
## GRPVN                     -0.029373   0.034163  -0.860    0.392    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value    
## s(log2_age)  1.00      1 39.40 < 2e-16 ***
## s(FAM)      35.57     88  0.78 0.00128 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.528   Deviance explained =   68%
## GCV = 0.021628  Scale est. = 0.014543  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.583   0.447
## aBreastFeed_full_duration  1 0.223   0.638
## aBreastFeed_total_stopped  1 2.565   0.113
## SEX                        1 0.031   0.861
## GRP                        2 0.431   0.651
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 39.40 < 2e-16
## s(FAM)      35.57  88.00  0.78 0.00128
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, 
       remove_random = FALSE),
  
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate  Std. Error     t value
## (Intercept)                1.736445491 0.070424320 24.65690107
## GRPVG                      0.003339893 0.045057853  0.07412454
## GRPVN                     -0.027644335 0.038198440 -0.72370323
## SEXM                      -0.004288312 0.027220581 -0.15753934
## aBreastFeed_full_stopped   0.059597041 0.081784447  0.72870874
## aBreastFeed_full_duration -0.002388312 0.006806509 -0.35088655
## aBreastFeed_total_stopped  0.070362794 0.041391529  1.69993224
## log2_age                  -0.109035378 0.018074546 -6.03253759


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum P level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum P level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate       CI-L       CI-U          P
VN vs OM  -0.0276443 -0.1025119  0.0472232  0.4692479
VG vs OM   0.0033399 -0.0849719  0.0916517  0.9409113
VN vs VG  -0.0309842 -0.1119776  0.0500091  0.4533815

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))
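
The leave-one-factor logic can be sketched as follows: refit the model with one predictor (or predictor block) removed and compare its AIC with that of the full model. The example uses ordinary `lm()` on simulated data as a simplified stand-in for the mixed models fitted by `rlme()`; how `add_AIC()` stores or signs the differences is not shown in this report, so the `delta_aic` convention below (reduced minus full) is an assumption.

```r
# Simulated data only; effects are chosen so that age matters and diet does not
set.seed(1)
n <- 120
sim <- data.frame(
  GRP = factor(sample(c("OM", "VG", "VN"), n, replace = TRUE)),
  SEX = factor(sample(c("F", "M"), n, replace = TRUE)),
  log2_age = runif(n, 0, 4)
)
sim$outcome <- 1.7 - 0.5 * sim$log2_age + rnorm(n, sd = 0.3)

# Full model, then one refit per dropped term
full <- lm(outcome ~ GRP + SEX + log2_age, data = sim)
drop_terms <- c("GRP", "SEX", "log2_age")
delta_aic <- sapply(drop_terms, function(term) {
  reduced <- update(full, as.formula(paste(". ~ . -", term)))
  AIC(reduced) - AIC(full)   # positive: the full model is preferred
})

print(round(delta_aic, 1))
```

A large positive difference for a term suggests that the term carries predictive information; values near zero or negative suggest it adds little beyond the remaining predictors.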

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE); diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.365
##   Unadjusted ICC: 0.252


i = i+1

2.2.12 aMg

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Mg))

column_name
## [1] "aMg"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Mg +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Mg + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                0.878568   0.033199  26.464   <2e-16 ***
## aBreastFeed_full_stopped   0.011782   0.033802   0.349   0.7284    
## aBreastFeed_full_duration -0.004623   0.002812  -1.644   0.1041    
## aBreastFeed_total_stopped -0.029286   0.017234  -1.699   0.0932 .  
## SEXM                       0.010022   0.011258   0.890   0.3761    
## GRPVG                      0.028243   0.019284   1.465   0.1470    
## GRPVN                      0.028223   0.016349   1.726   0.0883 .  
## aSUP_Mg                   -0.029670   0.031203  -0.951   0.3446    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 2.837 0.09614 . 
## s(FAM)      45.82     88 1.039 0.00116 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.517   Deviance explained = 71.4%
## GCV = 0.0043252  Scale est. = 0.0025426  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Mg + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.121  0.7284
## aBreastFeed_full_duration  1 2.704  0.1041
## aBreastFeed_total_stopped  1 2.888  0.0932
## SEX                        1 0.792  0.3761
## GRP                        2 1.703  0.1887
## aSUP_Mg                    1 0.904  0.3446
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 2.837 0.09614
## s(FAM)      45.82  88.00 1.039 0.00116
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE,include = c('aSUP_Mg'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate  Std. Error    t value
## (Intercept)                0.870776459 0.028291003 30.7792712
## GRPVG                      0.025967756 0.015360512  1.6905527
## GRPVN                      0.028799596 0.013298962  2.1655522
## SEXM                       0.004882369 0.011016502  0.4431869
## aBreastFeed_full_stopped   0.030128945 0.033718439  0.8935451
## aBreastFeed_full_duration -0.003989540 0.002740525 -1.4557576
## aBreastFeed_total_stopped -0.028843504 0.016589529 -1.7386571
## log2_age                  -0.012765641 0.007535495 -1.6940679
## aSUP_Mg                   -0.016825712 0.026210038 -0.6419568


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum Mg level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum Mg level. CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate       CI-L       CI-U          P
VN vs OM   0.0287996  0.0027341  0.0548651  0.0303454
VG vs OM   0.0259678 -0.0041383  0.0560738  0.0909223
VN vs VG   0.0028318 -0.0255770  0.0312407  0.8451016

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Mg'))
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Mg'),remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Mg'),exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Mg'),
        exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Mg'),
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Mg'),
        exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE); diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)


## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.345
##   Unadjusted ICC: 0.269

i = i+1

2.2.13 aSe

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Se))

column_name
## [1] "aSe"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Se +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                0.594086   0.171839   3.457 0.000929 ***
## aBreastFeed_full_stopped   0.288464   0.176778   1.632 0.107185    
## aBreastFeed_full_duration -0.014666   0.011253  -1.303 0.196707    
## aBreastFeed_total_stopped  0.005528   0.071658   0.077 0.938724    
## SEXM                       0.016765   0.046002   0.364 0.716612    
## GRPVG                      0.019396   0.081101   0.239 0.811673    
## GRPVN                     -0.058158   0.068464  -0.849 0.398492    
## aSUP_Se                    0.437721   0.194520   2.250 0.027556 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  3.434  4.118 0.727    0.649    
## s(FAM)      49.096 85.000 1.423 6.68e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.513   Deviance explained = 73.6%
## GCV = 0.069977  Scale est. = 0.037643  n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 2.663  0.1072
## aBreastFeed_full_duration  1 1.699  0.1967
## aBreastFeed_total_stopped  1 0.006  0.9387
## SEX                        1 0.133  0.7166
## GRP                        2 0.655  0.5226
## aSUP_Se                    1 5.064  0.0276
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  3.434  4.118 0.727    0.649
## s(FAM)      49.096 85.000 1.423 6.68e-05
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Se +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)  
## (Intercept)               -0.794784   0.310834  -2.557   0.0132 *
## aBreastFeed_full_stopped   0.570054   0.316266   1.802   0.0767 .
## aBreastFeed_full_duration -0.027189   0.018976  -1.433   0.1573  
## aBreastFeed_total_stopped  0.002423   0.125417   0.019   0.9847  
## SEXM                       0.029882   0.078284   0.382   0.7041  
## GRPVG                     -0.010152   0.156673  -0.065   0.9486  
## GRPVN                     -0.148889   0.132074  -1.127   0.2643  
## aSUP_Se                    0.762199   0.375616   2.029   0.0470 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value    
## s(log2_age)  4.102  4.874 1.375   0.349    
## s(FAM)      60.933 85.000 2.654  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.663   Deviance explained =   85%
## GCV = 0.20133  Scale est. = 0.089087  n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 3.249  0.0767
## aBreastFeed_full_duration  1 2.053  0.1573
## aBreastFeed_total_stopped  1 0.000  0.9847
## SEX                        1 0.146  0.7041
## GRP                        2 0.811  0.4494
## aSUP_Se                    1 4.118  0.0470
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  4.102  4.874 1.375   0.349
## s(FAM)      60.933 85.000 2.654  <2e-16
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The distribution of residuals seems slightly better. Let's continue on the log scale.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome)
column_name <- paste0("log2_", column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE,include = c('aSUP_Se'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error     t value
## (Intercept)               -0.644653891 0.18240050 -3.53427706
## GRPVG                     -0.015944069 0.15383933 -0.10364105
## GRPVN                     -0.168973335 0.13004310 -1.29936412
## SEXM                      -0.014480457 0.06752899 -0.21443320
## aBreastFeed_full_stopped   0.471299911 0.19853369  2.37390392
## aBreastFeed_full_duration -0.029831051 0.01675827 -1.78007934
## aBreastFeed_total_stopped  0.006874390 0.10909572  0.06301247
## log2_age                  -0.004513157 0.04539399 -0.09942188
## aSUP_Se                    0.783071123 0.36996676  2.11659856


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(serum Se level). CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(serum Se level). CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate       CI-L       CI-U          P
VN vs OM  -0.1689733 -0.4238531  0.0859065  0.1938190
VG vs OM  -0.0159441 -0.3174636  0.2855755  0.9174542
VN vs VG  -0.1530293 -0.4286581  0.1225996  0.2765187

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Se'))
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Se'),remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Se'),exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Se'),
        exclude = c('aBreastFeed_full_stopped',
                    'aBreastFeed_total_stopped',
                    'aBreastFeed_full_duration'))

### model without  diet groups
mod_ndiet <- rlme(include = c('aSUP_Se'),
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Se'),
        exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE); diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.622
##   Unadjusted ICC: 0.576

i = i+1

2.2.14 aZn

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Zn))

column_name
## [1] "aZn"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Zn +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Zn + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               11.91857    1.23740   9.632 1.08e-15 ***
## aBreastFeed_full_stopped   0.41895    1.28687   0.326    0.745    
## aBreastFeed_full_duration -0.01012    0.10662  -0.095    0.925    
## aBreastFeed_total_stopped -0.27291    0.63905  -0.427    0.670    
## SEXM                       0.03881    0.41963   0.092    0.927    
## GRPVG                     -0.61041    0.65565  -0.931    0.354    
## GRPVN                     -0.85555    0.55998  -1.528    0.130    
## aSUP_Zn                   -0.30574    1.26267  -0.242    0.809    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value  
## s(log2_age)  1.00      1 1.367  0.2452  
## s(FAM)      27.65     85 0.523  0.0138 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.257   Deviance explained = 46.1%
## GCV =  5.998  Scale est. = 4.3199    n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Zn + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.106   0.745
## aBreastFeed_full_duration  1 0.009   0.925
## aBreastFeed_total_stopped  1 0.182   0.670
## SEX                        1 0.009   0.927
## GRP                        2 1.197   0.307
## aSUP_Zn                    1 0.059   0.809
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 1.367  0.2452
## s(FAM)      27.65  85.00 0.523  0.0138
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
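    aSUP_Se +  # NOTE: aSUP_Zn was probably intended (as in the fit above and the main model); the output below reflects aSUP_Se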
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                3.5429675  0.1484454  23.867   <2e-16 ***
## aBreastFeed_full_stopped   0.0541757  0.1534889   0.353    0.725    
## aBreastFeed_full_duration  0.0002158  0.0126196   0.017    0.986    
## aBreastFeed_total_stopped -0.0448395  0.0762395  -0.588    0.558    
## SEXM                      -0.0009393  0.0505791  -0.019    0.985    
## GRPVG                     -0.0641195  0.0800983  -0.801    0.426    
## GRPVN                     -0.1099749  0.0680649  -1.616    0.110    
## aSUP_Se                    0.0044955  0.1935471   0.023    0.982    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 1.273 0.26211   
## s(FAM)      31.01     85 0.633 0.00574 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.297   Deviance explained = 50.8%
## GCV = 0.087011  Scale est. = 0.060437  n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.125   0.725
## aBreastFeed_full_duration  1 0.000   0.986
## aBreastFeed_total_stopped  1 0.346   0.558
## SEX                        1 0.000   0.985
## GRP                        2 1.305   0.276
## aSUP_Se                    1 0.001   0.982
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 1.273 0.26211
## s(FAM)      31.01  85.00 0.633 0.00574
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The distribution of residuals seems better. Let's continue on the log scale.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome)
column_name <- paste0("log2_", column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE,include = c('aSUP_Zn'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error     t value
## (Intercept)                3.515993174 0.11505756 30.55855956
## GRPVG                     -0.081308430 0.07593308 -1.07079066
## GRPVN                     -0.127464488 0.06450398 -1.97607183
## SEXM                      -0.001046543 0.04459465 -0.02346791
## aBreastFeed_full_stopped   0.026217017 0.13559335  0.19335032
## aBreastFeed_full_duration  0.002716316 0.01131953  0.23996733
## aBreastFeed_total_stopped -0.060183499 0.06881149 -0.87461404
## log2_age                   0.045844183 0.03062325  1.49703855
## aSUP_Zn                   -0.040047731 0.14282205 -0.28040300


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(serum Zn level). CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(serum Zn level). CI_L and CI_U are bounds of 95% confidence interval
Contrast    Estimate       CI-L       CI-U          P
VN vs OM  -0.1274645 -0.2538900 -0.0010390  0.0481466
VG vs OM  -0.0813084 -0.2301345  0.0675177  0.2842636
VN vs VG  -0.0461561 -0.1876719  0.0953598  0.5226594
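
Because the outcome was log2-transformed before fitting, the estimates above are differences on the log2 scale. A minimal sketch of how such an estimate can be converted into a ratio and a percentage difference is given below; the object names `log2_contrast`, `ratio` and `pct_change` are introduced here for illustration, and the numbers are the VN vs OM row of the table.

```r
# VN vs OM contrast for log2(aZn), copied from the table above
log2_contrast <- c(estimate = -0.1274645, ci_l = -0.2538900, ci_u = -0.0010390)

# Back-transform: a difference d on the log2 scale corresponds to a ratio of 2^d
ratio <- 2^log2_contrast

# Express the ratio as a percentage difference relative to the reference group
pct_change <- (ratio - 1) * 100

print(round(rbind(ratio, pct_change), 3))
```

Read this way, the point estimate corresponds to a ratio of about 0.92, that is, roughly 8 to 9% lower serum Zn in vegan than in omnivore children at the same covariate values, with an interval whose upper bound lies just below a ratio of 1.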

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Zn'))
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Zn'),remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Zn'),exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Zn'),
        exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without  diet groups
mod_ndiet <- rlme(include = c('aSUP_Zn'),
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Zn'),
        exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE); diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.299
##   Unadjusted ICC: 0.285

i = i+1

2.2.15 aFE

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aFE"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                13.8736     3.4412   4.032 0.000113 ***
## aBreastFeed_full_stopped   -2.0134     3.4728  -0.580 0.563482    
## aBreastFeed_full_duration   0.1975     0.2830   0.698 0.486995    
## aBreastFeed_total_stopped   0.2402     1.7515   0.137 0.891225    
## SEXM                       -0.6136     1.1312  -0.542 0.588854    
## GRPVG                      -0.6106     1.7671  -0.346 0.730480    
## GRPVN                       1.4615     1.5049   0.971 0.333999    
## aSUP_Fe                    -0.3420     2.3990  -0.143 0.886946    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value   
## s(log2_age)  1.076  1.134 7.945 0.00454 **
## s(FAM)      30.643 88.000 0.591 0.00690 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.356   Deviance explained = 54.5%
## GCV =   44.2  Scale est. = 31        n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.336   0.563
## aBreastFeed_full_duration  1 0.487   0.487
## aBreastFeed_total_stopped  1 0.019   0.891
## SEX                        1 0.294   0.589
## GRP                        2 0.948   0.391
## aSUP_Fe                    1 0.020   0.887
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  1.076  1.134 7.945 0.00454
## s(FAM)      30.643 88.000 0.591 0.00690
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE,
       include = c('aSUP_Fe'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                             Estimate Std. Error     t value
## (Intercept)               10.7708087  2.9914062  3.60058384
## GRPVG                     -0.7739504  1.5835387 -0.48874736
## GRPVN                      1.8052234  1.3738149  1.31402229
## SEXM                      -1.1644473  1.1466696 -1.01550374
## aBreastFeed_full_stopped  -2.1540705  3.5264332 -0.61083549
## aBreastFeed_full_duration  0.1824411  0.2853859  0.63927875
## aBreastFeed_total_stopped  0.1288869  1.7691646  0.07285186
## log2_age                   2.3807006  0.7794891  3.05418070
## aSUP_Fe                    0.1077969  2.2654153  0.04758371


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum Fe level. CI-L and CI-U are the bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum Fe level. CI-L and CI-U are the bounds of the 95% confidence interval
Contrast    Estimate        CI-L        CI-U           P
VN vs OM   1.8052234  -0.8874043    4.497851   0.1888387
VG vs OM  -0.7739504  -3.8776292    2.329728   0.6250206
VN vs VG   2.5791737  -0.3786597    5.537007   0.0874415
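
The contrasts in this table can be cross-checked against the coefficient summary printed before it. The sketch below assumes (the report does not confirm this) that `emm()` returns Wald-type intervals and P-values based on the standard normal distribution; the reported numbers are consistent with that assumption. Variable names are illustrative only.

```r
# Values copied from summary(mod_main)[['coefficients']] for the aFE model
est_vn <- 1.8052234    # GRPVN: vegan vs omnivore
se_vn  <- 1.3738149
est_vg <- -0.7739504   # GRPVG: vegetarian vs omnivore

# Wald 95% interval and two-sided P-value (normal approximation; an assumption)
z     <- qnorm(0.975)
ci_vn <- est_vn + c(-1, 1) * z * se_vn
p_vn  <- 2 * pnorm(-abs(est_vn / se_vn))

# With treatment coding (OM as reference), VN vs VG is the difference
# of the two group coefficients
est_vn_vg <- est_vn - est_vg

print(round(c(CI_L = ci_vn[1], CI_U = ci_vn[2], P = p_vn, VN_vs_VG = est_vn_vg), 4))
```

The standard error of the VN vs VG contrast also depends on the covariance between the two coefficients, which the summary does not print, so only its point estimate is reproduced here.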

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
                     exclude = c('log2_age'))
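
`rlme()` and `add_AIC()` are helpers of this report whose definitions are not shown in this section. The general idea of a leave-one-factor comparison can be sketched in base R on simulated data: fit the full model, refit it without each predictor in turn, and compare AIC. All data and names below are hypothetical, and an ordinary `lm()` stands in for the mixed model.

```r
set.seed(1)

# Hypothetical data: the outcome depends on age but not on sex or diet group
n   <- 120
toy <- data.frame(
  grp      = factor(sample(c('OM', 'VG', 'VN'), n, replace = TRUE)),
  sex      = factor(sample(c('F', 'M'), n, replace = TRUE)),
  log2_age = runif(n, 0, 4)
)
toy$outcome <- 10 + 2 * toy$log2_age + rnorm(n, sd = 3)

full <- lm(outcome ~ grp + sex + log2_age, data = toy)

# Refit the model, leaving out one predictor at a time
terms_to_drop <- c('grp', 'sex', 'log2_age')
reduced <- lapply(terms_to_drop, function(term) {
  update(full, as.formula(paste('. ~ . -', term)))
})
names(reduced) <- paste0('without_', terms_to_drop)

# Positive delta: AIC rises when the predictor is dropped,
# i.e. the full model is preferred
aic_values <- c(AIC(full), vapply(reduced, AIC, numeric(1)))
aic_tab <- data.frame(
  model = c('full', names(reduced)),
  AIC   = unname(aic_values)
)
aic_tab$delta_vs_full <- aic_tab$AIC - AIC(full)
print(aic_tab, row.names = FALSE)
```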

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.295
##   Unadjusted ICC: 0.252
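
As a reminder of what the two numbers mean, the sketch below computes both quantities from variance components. It assumes that `icc()` here is the function from the performance package and follows its documented definitions: the adjusted ICC relates the random-effect variance to the random-effect plus residual variance, and the unadjusted ICC adds the fixed-effects variance to the denominator. The variance values are hypothetical and are not taken from the fitted model.

```r
# Hypothetical variance components (illustrative only)
var_family   <- 12   # between-family (random intercept) variance
var_residual <- 30   # within-family residual variance
var_fixed    <- 6    # variance explained by the fixed effects

icc_adjusted   <- var_family / (var_family + var_residual)
icc_unadjusted <- var_family / (var_family + var_residual + var_fixed)

print(round(c(adjusted = icc_adjusted, unadjusted = icc_unadjusted), 3))
```

Under these definitions the unadjusted value cannot exceed the adjusted one, which matches the pattern in the output above (0.252 vs 0.295).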

i = i+1

2.2.16 aVKFE

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aVKFE"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                62.3823     4.4650  13.971  < 2e-16 ***
## aBreastFeed_full_stopped   13.2154     4.4888   2.944  0.00419 ** 
## aBreastFeed_full_duration  -0.5952     0.3710  -1.604  0.11239    
## aBreastFeed_total_stopped   1.3796     2.3093   0.597  0.55185    
## SEXM                        0.7382     1.4869   0.496  0.62084    
## GRPVG                      -0.6800     2.4266  -0.280  0.78000    
## GRPVN                      -3.7722     2.0706  -1.822  0.07204 .  
## aSUP_Fe                     7.0698     3.2251   2.192  0.03114 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value    
## s(log2_age)  1.00      1 2.118 0.149349    
## s(FAM)      38.95     87 0.897 0.000901 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.422   Deviance explained = 62.9%
## GCV = 75.421  Scale est. = 48.024    n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 8.667 0.00419
## aBreastFeed_full_duration  1 2.574 0.11239
## aBreastFeed_total_stopped  1 0.357 0.55185
## SEX                        1 0.247 0.62084
## GRP                        2 1.955 0.14791
## aSUP_Fe                    1 4.805 0.03114
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value
## s(log2_age)  1.00   1.00 2.118 0.149349
## s(FAM)      38.95  87.00 0.897 0.000901
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Se +   # note: aSUP_Se here, whereas the untransformed model above adjusts for aSUP_Fe
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                5.983017   0.088714  67.441   <2e-16 ***
## aBreastFeed_full_stopped   0.273358   0.090998   3.004   0.0035 ** 
## aBreastFeed_full_duration -0.011441   0.007584  -1.509   0.1351    
## aBreastFeed_total_stopped  0.005150   0.046104   0.112   0.9113    
## SEXM                       0.011662   0.030345   0.384   0.7017    
## GRPVG                     -0.017728   0.049211  -0.360   0.7196    
## GRPVN                     -0.073403   0.042327  -1.734   0.0865 .  
## aSUP_Se                    0.015734   0.120183   0.131   0.8962    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value    
## s(log2_age)  1.0      1 1.055 0.307258    
## s(FAM)      37.9     86 0.886 0.000824 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.409   Deviance explained = 61.6%
## GCV = 0.031298  Scale est. = 0.020177  n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 9.024  0.0035
## aBreastFeed_full_duration  1 2.276  0.1351
## aBreastFeed_total_stopped  1 0.012  0.9113
## SEX                        1 0.148  0.7017
## GRP                        2 1.681  0.1922
## aSUP_Se                    1 0.017  0.8962
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value
## s(log2_age)  1.0    1.0 1.055 0.307258
## s(FAM)      37.9   86.0 0.886 0.000824
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation did not improve the fit substantially. We will continue with the original scale.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE,
       include = c('aSUP_Fe'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                             Estimate Std. Error    t value
## (Intercept)               64.4059077  4.0633714 15.8503621
## GRPVG                     -1.4747728  2.1520674 -0.6852819
## GRPVN                     -4.0158026  1.8766054 -2.1399291
## SEXM                       1.2272099  1.5655842  0.7838671
## aBreastFeed_full_stopped  13.7748521  4.7908254  2.8752565
## aBreastFeed_full_duration -0.6906090  0.3880067 -1.7798896
## aBreastFeed_total_stopped -0.5705518  2.4145215 -0.2363002
## log2_age                  -0.4710794  1.0602540 -0.4443081
## aSUP_Fe                    4.7728782  3.0771535  1.5510693


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum ferritin Fe binding capacity. CI-L and CI-U are the bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum ferritin Fe binding capacity. CI-L and CI-U are the bounds of the 95% confidence interval
Contrast    Estimate        CI-L        CI-U           P
VN vs OM   -4.015803   -7.693882  -0.3377235   0.0323605
VG vs OM   -1.474773   -5.692747   2.7432018   0.4931661
VN vs VG   -2.541030   -6.566632   1.4845727   0.2160265

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.386
##   Unadjusted ICC: 0.342

i = i+1

2.2.17 aFERR

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aFERR"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                83.5495     9.9900   8.363 3.72e-13 ***
## aBreastFeed_full_stopped  -62.3996    10.2822  -6.069 2.32e-08 ***
## aBreastFeed_full_duration   0.5551     0.5726   0.970   0.3346    
## aBreastFeed_total_stopped  -7.2663     3.6290  -2.002   0.0480 *  
## SEXM                        0.1113     2.3019   0.048   0.9615    
## GRPVG                      -5.0793     3.3969  -1.495   0.1380    
## GRPVN                      -3.4860     2.9029  -1.201   0.2326    
## aSUP_Fe                   -10.8859     4.6701  -2.331   0.0218 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value   
## s(log2_age)  7.764  8.511 3.186  0.0022 **
## s(FAM)      17.186 88.000 0.272  0.0562 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.382   Deviance explained = 53.1%
## GCV = 185.64  Scale est. = 139.65    n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 36.829 2.32e-08
## aBreastFeed_full_duration  1  0.940   0.3346
## aBreastFeed_total_stopped  1  4.009   0.0480
## SEX                        1  0.002   0.9615
## GRP                        2  1.257   0.2890
## aSUP_Fe                    1  5.433   0.0218
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  7.764  8.511 3.186  0.0022
## s(FAM)      17.186 88.000 0.272  0.0562
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Se +   # note: aSUP_Se here, whereas the untransformed model above adjusts for aSUP_Fe
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                5.60045    0.41955  13.349  < 2e-16 ***
## aBreastFeed_full_stopped  -1.63261    0.43461  -3.757  0.00029 ***
## aBreastFeed_full_duration  0.03298    0.03597   0.917  0.36133    
## aBreastFeed_total_stopped -0.12395    0.21573  -0.575  0.56688    
## SEXM                      -0.12517    0.14330  -0.873  0.38451    
## GRPVG                     -0.24556    0.21730  -1.130  0.26116    
## GRPVN                     -0.27089    0.18700  -1.449  0.15057    
## aSUP_Se                    0.53357    0.53390   0.999  0.32003    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(log2_age)  1.00      1 12.157 0.000728 ***
## s(FAM)      24.25     87  0.421 0.025699 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.339   Deviance explained = 50.1%
## GCV = 0.70899  Scale est. = 0.53172   n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Se + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F p-value
## aBreastFeed_full_stopped   1 14.111 0.00029
## aBreastFeed_full_duration  1  0.841 0.36133
## aBreastFeed_total_stopped  1  0.330 0.56688
## SEX                        1  0.763 0.38451
## GRP                        2  1.163 0.31660
## aSUP_Se                    1  0.999 0.32003
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value
## s(log2_age)  1.00   1.00 12.157 0.000728
## s(FAM)      24.25  87.00  0.421 0.025699
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. aFERR appears to change in a multiplicative rather than an additive way, so we continue with log2-transformed values.
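
The distinction between multiplicative and additive change can be illustrated with a minimal sketch on hypothetical values (not study data): a quantity that doubles at each step has growing differences on the original scale but constant differences on the log2 scale, which is what a linear model assumes.

```r
# Hypothetical values that double at each step (multiplicative change)
x <- c(10, 20, 40, 80)

raw_steps  <- diff(x)        # differences grow on the original scale
log2_steps <- diff(log2(x))  # differences are constant on the log2 scale

print(raw_steps)
print(log2_steps)
```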

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome)
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE,
       include = c('aSUP_Fe'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                5.13335065 0.36414370 14.0970465
## GRPVG                     -0.19398462 0.19276407 -1.0063318
## GRPVN                     -0.25351004 0.16723441 -1.5158964
## SEXM                      -0.15726658 0.13958403 -1.1266804
## aBreastFeed_full_stopped  -1.59228598 0.42927251 -3.7092661
## aBreastFeed_full_duration  0.04131011 0.03474001  1.1891219
## aBreastFeed_total_stopped -0.17718814 0.21536031 -0.8227521
## log2_age                   0.32987192 0.09488716  3.4764651
## aSUP_Fe                   -0.44268830 0.27576888 -1.6052874


res <- emm(mod_main)

suppl_table <- kbl(
  res, 
  caption =
    'Results of linear mixed-effects model of log2-transformed serum ferritin levels. CI-L and CI-U are the bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2-transformed serum ferritin levels. CI-L and CI-U are the bounds of the 95% confidence interval
Contrast    Estimate        CI-L        CI-U           P
VN vs OM  -0.2535100  -0.5812835   0.0742634   0.1295455
VG vs OM  -0.1939846  -0.5717953   0.1838260   0.3142560
VN vs VG  -0.0595254  -0.4195823   0.3005315   0.7459186
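
Because the outcome was log2-transformed before fitting, these estimates are differences on the log2 scale. A minimal sketch of how they could be read on the original scale is to raise 2 to the estimate, which gives an approximate ratio between groups. The variable names are illustrative; the numbers are copied from the VN vs OM row.

```r
# Estimate and 95% CI bounds for VN vs OM, copied from the table (log2 scale)
log2_est <- c(estimate = -0.2535100, ci_l = -0.5812835, ci_u = 0.0742634)

# 2^x converts a log2 difference into a ratio on the original scale
ratio <- 2^log2_est
print(round(ratio, 3))

# Percent difference relative to omnivores
print(round(100 * (ratio - 1), 1))
```

For VN vs OM the point estimate corresponds to a ratio of about 0.84, i.e. ferritin roughly 16% lower in vegans, with an interval that includes a ratio of 1.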

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.272
##   Unadjusted ICC: 0.214

i = i+1

2.2.18 aTRF

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aTRF"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                2.47916    0.17422  14.230  < 2e-16 ***
## aBreastFeed_full_stopped   0.53068    0.17519   3.029  0.00325 ** 
## aBreastFeed_full_duration -0.02465    0.01447  -1.704  0.09212 .  
## aBreastFeed_total_stopped  0.04163    0.08971   0.464  0.64381    
## SEXM                       0.04111    0.05776   0.712  0.47851    
## GRPVG                     -0.03762    0.09471  -0.397  0.69217    
## GRPVN                     -0.15858    0.08038  -1.973  0.05177 .  
## aSUP_Fe                    0.28866    0.12591   2.293  0.02435 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 1.820 0.18085   
## s(FAM)      39.39     88 0.879 0.00114 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.425   Deviance explained = 63.1%
## GCV = 0.11495  Scale est. = 0.073131  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 9.176 0.00325
## aBreastFeed_full_duration  1 2.902 0.09212
## aBreastFeed_total_stopped  1 0.215 0.64381
## SEX                        1 0.507 0.47851
## GRP                        2 2.208 0.11618
## aSUP_Fe                    1 5.256 0.02435
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 1.820 0.18085
## s(FAM)      39.39  88.00 0.879 0.00114
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE,
       include = c('aSUP_Fe'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                2.54999401 0.15553735 16.3947377
## GRPVG                     -0.07110078 0.08579585 -0.8287205
## GRPVN                     -0.16624507 0.07379917 -2.2526685
## SEXM                       0.05247112 0.05951443  0.8816536
## aBreastFeed_full_stopped   0.54626083 0.18242962  2.9943647
## aBreastFeed_full_duration -0.02602305 0.01485284 -1.7520594
## aBreastFeed_total_stopped -0.01540385 0.09182296 -0.1677560
## log2_age                  -0.02651447 0.04057780 -0.6534231
## aSUP_Fe                    0.20447129 0.12067443  1.6944045


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of serum transferrin level. CI-L and CI-U are the bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of serum transferrin level. CI-L and CI-U are the bounds of the 95% confidence interval
Contrast    Estimate        CI-L        CI-U           P
VN vs OM  -0.1662451  -0.3108888  -0.0216013   0.0242801
VG vs OM  -0.0711008  -0.2392576   0.0970560   0.4072626
VN vs VG  -0.0951443  -0.2535254   0.0632368   0.2390322

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.377
##   Unadjusted ICC: 0.330

i = i+1

2.2.19 aSATTRF

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]



dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aSATTRF"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               21.97906    5.09171   4.317  3.9e-05 ***
## aBreastFeed_full_stopped  -6.70051    5.12459  -1.308    0.194    
## aBreastFeed_full_duration  0.44533    0.40280   1.106    0.272    
## aBreastFeed_total_stopped  0.18633    2.50413   0.074    0.941    
## SEXM                      -0.94681    1.61532  -0.586    0.559    
## GRPVG                     -0.04314    2.49683  -0.017    0.986    
## GRPVN                      3.18337    2.12907   1.495    0.138    
## aSUP_Fe                   -2.72106    3.40006  -0.800    0.426    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value   
## s(log2_age)  1.403  1.654 6.243 0.00523 **
## s(FAM)      29.151 88.000 0.545 0.00980 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.348   Deviance explained = 53.4%
## GCV = 89.747  Scale est. = 63.731    n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.710   0.194
## aBreastFeed_full_duration  1 1.222   0.272
## aBreastFeed_total_stopped  1 0.006   0.941
## SEX                        1 0.344   0.559
## GRP                        2 1.525   0.223
## aSUP_Fe                    1 0.640   0.426
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  1.403  1.654 6.243 0.00523
## s(FAM)      29.151 88.000 0.545 0.00980
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE,
       include = c('aSUP_Fe'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error     t value
## (Intercept)               16.35324830  4.2672915  3.83223139
## GRPVG                     -0.03228986  2.2589447 -0.01429422
## GRPVN                      3.81566045  1.9597701  1.94699386
## SEXM                      -1.70386532  1.6357436 -1.04164572
## aBreastFeed_full_stopped  -6.44160947  5.0305166 -1.28050656
## aBreastFeed_full_duration  0.43812856  0.4071078  1.07619790
## aBreastFeed_total_stopped  0.74619741  2.5237432  0.29567089
## log2_age                   3.38235120  1.1119544  3.04180747
## aSUP_Fe                   -2.25049575  3.2316533 -0.69639147


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of transferrin saturation. CI-L and CI-U are the bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of transferrin saturation. CI-L and CI-U are the bounds of the 95% confidence interval
Contrast    Estimate        CI-L        CI-U           P
VN vs OM   3.8156604  -0.0254185    7.656739   0.0515355
VG vs OM  -0.0322899  -4.4597402    4.395160   0.9885952
VN vs VG   3.8479503  -0.3714491    8.067350   0.0738694

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.267
##   Unadjusted ICC: 0.224

i = i+1

2.2.20 aTRFINDEX

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aTRFINDEX"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                0.53819    0.35308   1.524 0.130645    
## aBreastFeed_full_stopped   1.24975    0.35765   3.494 0.000713 ***
## aBreastFeed_full_duration -0.03609    0.02932  -1.231 0.221197    
## aBreastFeed_total_stopped -0.11297    0.18110  -0.624 0.534212    
## SEXM                       0.12311    0.11718   1.051 0.295984    
## GRPVG                      0.04910    0.17859   0.275 0.783961    
## GRPVN                      0.01313    0.15245   0.086 0.931539    
## aSUP_Fe                    0.34523    0.24521   1.408 0.162294    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value    
## s(log2_age)  1.0      1 13.24 0.000437 ***
## s(FAM)      25.3     88  0.41 0.046792 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.348   Deviance explained = 51.3%
## GCV = 0.4758  Scale est. = 0.35308   n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 12.210 0.000713
## aBreastFeed_full_duration  1  1.516 0.221197
## aBreastFeed_total_stopped  1  0.389 0.534212
## SEX                        1  1.104 0.295984
## GRP                        2  0.040 0.961128
## aSUP_Fe                    1  1.982 0.162294
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value
## s(log2_age)  1.0    1.0 13.24 0.000437
## s(FAM)      25.3   88.0  0.41 0.046792
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               -0.51104    0.25552  -2.000   0.0483 *  
## aBreastFeed_full_stopped   1.09404    0.25811   4.239  5.1e-05 ***
## aBreastFeed_full_duration -0.03125    0.02080  -1.502   0.1362    
## aBreastFeed_total_stopped  0.08315    0.12879   0.646   0.5201    
## SEXM                       0.08382    0.08325   1.007   0.3165    
## GRPVG                      0.08239    0.12716   0.648   0.5185    
## GRPVN                      0.03516    0.10853   0.324   0.7467    
## aSUP_Fe                    0.37311    0.17428   2.141   0.0348 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(log2_age)  1.16  1.278 19.373 1.14e-05 ***
## s(FAM)      26.15 88.000  0.457   0.0212 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.406   Deviance explained =   56%
## GCV = 0.23947  Scale est. = 0.17589   n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F p-value
## aBreastFeed_full_stopped   1 17.966 5.1e-05
## aBreastFeed_full_duration  1  2.257  0.1362
## aBreastFeed_total_stopped  1  0.417  0.5201
## SEX                        1  1.014  0.3165
## GRP                        2  0.210  0.8110
## aSUP_Fe                    1  4.583  0.0348
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F  p-value
## s(log2_age)  1.160  1.278 19.373 1.14e-05
## s(FAM)      26.151 88.000  0.457   0.0212
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation slightly improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)


## main model 
mod_main <- run(
   rlme(main = TRUE,
       include = c('aSUP_Fe'), 
       remove_random = FALSE),
  
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)               -0.16175577 0.19376737 -0.8347937
## GRPVG                      0.07796802 0.10257321  0.7601207
## GRPVN                      0.03487963 0.08898842  0.3919570
## SEXM                       0.08290378 0.07427516  1.1161711
## aBreastFeed_full_stopped   1.00254138 0.22842357  4.3889576
## aBreastFeed_full_duration -0.03249353 0.01848578 -1.7577584
## aBreastFeed_total_stopped  0.21873192 0.11459707  1.9087043
## log2_age                  -0.27163505 0.05049116 -5.3798540
## aSUP_Fe                    0.33845706 0.14674155  2.3064842


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of TRF index. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of TRF index. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM    0.0348796   -0.1395345   0.2092937   0.6950900
VG vs OM    0.0779680   -0.1230718   0.2790078   0.4471824
VN vs VG   -0.0430884   -0.2346811   0.1485043   0.6593668
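
Because the outcome was log2-transformed before fitting, the contrasts in this table are differences on the log2 scale. Raising 2 to the power of an estimate converts it into a ratio of geometric means between the two groups. The sketch below back-transforms the tabulated values; the object names `trf_contrasts` and `trf_ratios` are introduced here for illustration only.

```r
# Contrasts copied from the table above (log2 scale)
trf_contrasts <- data.frame(
  contrast = c("VN vs OM", "VG vs OM", "VN vs VG"),
  estimate = c(0.0348796, 0.0779680, -0.0430884),
  ci_l     = c(-0.1395345, -0.1230718, -0.2346811),
  ci_u     = c(0.2092937, 0.2790078, 0.1485043)
)

# Back-transform: 2^(difference in log2 units) = ratio of geometric means
trf_ratios <- transform(
  trf_contrasts,
  ratio   = 2^estimate,
  ratio_l = 2^ci_l,
  ratio_u = 2^ci_u
)

trf_ratios[, c("contrast", "ratio", "ratio_l", "ratio_u")]
```

A ratio of 1 corresponds to no difference between groups. All three back-transformed intervals include 1, which agrees with the P values in the table.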

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
        exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without  diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
        exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet-group effects (main and non-robust models)
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.238
##   Unadjusted ICC: 0.173
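
For orientation, the two printed values can be related to variance components. The sketch below assumes that `icc()` is `performance::icc()` (the package is not named in this section) and follows that function's documented definitions: the adjusted ICC relates only to the random-effect and residual variances, whereas the unadjusted ICC also counts the variance of the fixed effects in the denominator. The variance values are hypothetical and are not taken from the fitted models.

```r
# Hypothetical variance components (NOT estimates from this report)
var_family   <- 0.30  # between-family (random intercept) variance
var_residual <- 0.70  # within-family residual variance
var_fixed    <- 0.25  # variance of the fixed-effects predictions

# Adjusted ICC: share of family-level variance among the random parts only
icc_adjusted <- var_family / (var_family + var_residual)

# Unadjusted ICC: fixed-effects variance is added to the denominator
icc_unadjusted <- var_family / (var_family + var_residual + var_fixed)

c(adjusted = icc_adjusted, unadjusted = icc_unadjusted)
```

With these inputs the unadjusted ICC is smaller than the adjusted one, which is the same ordering as in the output above.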

i = i+1

2.2.21 aSTRF

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Fe))

column_name
## [1] "aSTRF"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.19872    0.16067   7.461 4.14e-11 ***
## aBreastFeed_full_stopped   0.59769    0.16190   3.692 0.000373 ***
## aBreastFeed_full_duration -0.02013    0.01293  -1.557 0.122754    
## aBreastFeed_total_stopped  0.01538    0.08020   0.192 0.848344    
## SEXM                       0.03200    0.05178   0.618 0.537975    
## GRPVG                     -0.01915    0.08017  -0.239 0.811682    
## GRPVN                     -0.05715    0.06834  -0.836 0.405146    
## aSUP_Fe                    0.25874    0.10916   2.370 0.019808 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  1.255  1.428 13.86 0.000106 ***
## s(FAM)      29.216 88.000  0.51 0.021984 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.399   Deviance explained =   57%
## GCV = 0.092394  Scale est. = 0.065668  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 13.629 0.000373
## aBreastFeed_full_duration  1  2.425 0.122754
## aBreastFeed_total_stopped  1  0.037 0.848344
## SEX                        1  0.382 0.537975
## GRP                        2  0.373 0.689671
## aSUP_Fe                    1  5.618 0.019808
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  1.255  1.428 13.86 0.000106
## s(FAM)      29.216 88.000  0.51 0.021984
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Fe +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                0.287074   0.139400   2.059 0.042311 *  
## aBreastFeed_full_stopped   0.507889   0.140234   3.622 0.000481 ***
## aBreastFeed_full_duration -0.016738   0.010353  -1.617 0.109398    
## aBreastFeed_total_stopped  0.046116   0.064922   0.710 0.479318    
## SEXM                       0.020514   0.041702   0.492 0.623965    
## GRPVG                     -0.005776   0.064991  -0.089 0.929379    
## GRPVN                     -0.040734   0.055405  -0.735 0.464105    
## aSUP_Fe                    0.203685   0.087957   2.316 0.022816 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value    
## s(log2_age)  2.036  2.494 9.394 9.8e-05 ***
## s(FAM)      31.760 88.000 0.598  0.0093 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.433   Deviance explained = 60.8%
## GCV = 0.059366  Scale est. = 0.04071   n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Fe + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 13.117 0.000481
## aBreastFeed_full_duration  1  2.614 0.109398
## aBreastFeed_total_stopped  1  0.505 0.479318
## SEX                        1  0.242 0.623965
## GRP                        2  0.326 0.722962
## aSUP_Fe                    1  5.363 0.022816
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.036  2.494 9.394 9.8e-05
## s(FAM)      31.760 88.000 0.598  0.0093
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation slightly improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE,include = c('aSUP_Fe'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                0.46926832 0.10968743  4.2782324
## GRPVG                      0.01172688 0.05806443  0.2019632
## GRPVN                     -0.02843617 0.05037438 -0.5644968
## SEXM                       0.04688096 0.04204553  1.1150048
## aBreastFeed_full_stopped   0.46105950 0.12930554  3.5656593
## aBreastFeed_full_duration -0.01650168 0.01046439 -1.5769366
## aBreastFeed_total_stopped  0.07303085 0.06487087  1.1257881
## log2_age                  -0.13238459 0.02858193 -4.6317584
## aSUP_Fe                    0.17206237 0.08306715  2.0713648


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(soluble transferrin receptor level). CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(soluble transferrin receptor level). CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM   -0.0284362   -0.1271681   0.0702958   0.5724161
VG vs OM    0.0117269   -0.1020773   0.1255311   0.8399455
VN vs VG   -0.0401630   -0.1486194   0.0682934   0.4679587

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Fe'))
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Fe'),remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Fe'),exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Fe'),
        exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without  diet groups
mod_ndiet <- rlme(include = c('aSUP_Fe'),
         exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Fe'),
        exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet-group effects (main and non-robust models)
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.262
##   Unadjusted ICC: 0.197

i = i+1

2.2.22 aHGB

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aHGB"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               122.74805    4.97703  24.663   <2e-16 ***
## aBreastFeed_full_stopped   -1.91829    4.98365  -0.385    0.701    
## aBreastFeed_full_duration   0.13677    0.35457   0.386    0.701    
## aBreastFeed_total_stopped   0.01832    2.27774   0.008    0.994    
## SEXM                       -1.56801    1.44375  -1.086    0.281    
## GRPVG                       1.80607    2.52695   0.715    0.477    
## GRPVN                       1.26256    2.11961   0.596    0.553    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  1.982  2.403 8.136 0.000579 ***
## s(FAM)      47.379 86.000 1.257 0.000201 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.573   Deviance explained = 75.9%
## GCV = 68.526  Scale est. = 38.353    n = 128
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.148   0.701
## aBreastFeed_full_duration  1 0.149   0.701
## aBreastFeed_total_stopped  1 0.000   0.994
## SEX                        1 1.180   0.281
## GRP                        2 0.289   0.750
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  1.982  2.403 8.136 0.000579
## s(FAM)      47.379 86.000 1.257 0.000201
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                6.932308   0.059156 117.188   <2e-16 ***
## aBreastFeed_full_stopped  -0.019780   0.059305  -0.334    0.740    
## aBreastFeed_full_duration  0.001477   0.004156   0.355    0.723    
## aBreastFeed_total_stopped  0.003080   0.026730   0.115    0.909    
## SEXM                      -0.019741   0.016954  -1.164    0.248    
## GRPVG                      0.019120   0.029492   0.648    0.519    
## GRPVN                      0.014925   0.024741   0.603    0.548    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.104  2.555 8.018 0.000566 ***
## s(FAM)      46.729 86.000 1.226 0.000238 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.572   Deviance explained = 75.7%
## GCV = 0.0094269  Scale est. = 0.0053149  n = 128
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.111   0.740
## aBreastFeed_full_duration  1 0.126   0.723
## aBreastFeed_total_stopped  1 0.013   0.909
## SEX                        1 1.356   0.248
## GRP                        2 0.258   0.773
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.104  2.555 8.018 0.000566
## s(FAM)      46.729 86.000 1.226 0.000238
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation did not improve the fit substantially. We will continue to work with the original (untransformed) values.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)               121.9905347  3.8620803 31.5867422
## GRPVG                       1.1369635  2.1171548  0.5370243
## GRPVN                      -0.4399365  1.8315289 -0.2402018
## SEXM                       -1.0657992  1.5306017 -0.6963270
## aBreastFeed_full_stopped   -7.7221497  4.6434642 -1.6630148
## aBreastFeed_full_duration   0.2799735  0.3739679  0.7486564
## aBreastFeed_total_stopped  -0.5898554  2.3110586 -0.2552317
## log2_age                    4.3371619  1.0364402  4.1846716


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of hemoglobin level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of hemoglobin level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L        CI-U       P
VN vs OM   -0.4399365   -4.029667   3.149794   0.8101738
VG vs OM    1.1369635   -3.012584   5.286511   0.5912508
VN vs VG   -1.5769001   -5.525154   2.371354   0.4337484

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet-group effects (main and non-robust models)
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.453
##   Unadjusted ICC: 0.366

i = i+1

2.2.23 aMCV

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aMCV"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                80.8549     1.6263  49.716  < 2e-16 ***
## aBreastFeed_full_stopped   -4.3842     1.6607  -2.640  0.00989 ** 
## aBreastFeed_full_duration   0.1343     0.1364   0.985  0.32750    
## aBreastFeed_total_stopped   0.5073     0.8466   0.599  0.55062    
## SEXM                       -0.6617     0.5486  -1.206  0.23114    
## GRPVG                       0.2353     0.9008   0.261  0.79461    
## GRPVN                       3.1766     0.7592   4.184 7.05e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(log2_age)  1.00      1 24.760 3.67e-06 ***
## s(FAM)      36.57     86  0.842  0.00114 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.551   Deviance explained = 70.5%
## GCV = 10.123  Scale est. = 6.5983    n = 128
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df      F p-value
## aBreastFeed_full_stopped   1  6.970 0.00989
## aBreastFeed_full_duration  1  0.970 0.32750
## aBreastFeed_total_stopped  1  0.359 0.55062
## SEX                        1  1.455 0.23114
## GRP                        2 11.204 4.9e-05
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value
## s(log2_age)  1.00   1.00 24.760 3.67e-06
## s(FAM)      36.57  86.00  0.842  0.00114
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                             Estimate Std. Error    t value
## (Intercept)               78.5030691  1.4242874 55.1174339
## GRPVG                     -0.1263542  0.9334602 -0.1353611
## GRPVN                      2.9747381  0.7853586  3.7877452
## SEXM                      -0.7844403  0.5540062 -1.4159413
## aBreastFeed_full_stopped  -4.4061689  1.6763600 -2.6284145
## aBreastFeed_full_duration  0.1364613  0.1379682  0.9890783
## aBreastFeed_total_stopped  0.1671523  0.8593625  0.1945073
## log2_age                   2.0174872  0.3859169  5.2277759


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of mean corpuscular volume. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of mean corpuscular volume. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L        CI-U       P
VN vs OM    2.9747381    1.435464   4.514013   0.0001520
VG vs OM   -0.1263542   -1.955903   1.703194   0.8923264
VN vs VG    3.1010923    1.429286   4.772899   0.0002773
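
The VN vs OM row can be checked against the coefficient table above, where `GRPVN` has an estimate of 2.9747381 and a standard error of 0.7853586. The sketch below computes a Wald interval and a two-sided P value from the normal distribution. This is a consistency check only: the `emm()` helper is not shown in this section, so the calculation is an assumption about how the tabulated values could arise, not a description of its implementation.

```r
# Estimate and standard error of GRPVN from the coefficient table
est <- 2.9747381
se  <- 0.7853586

# Wald statistic, 95% interval, and two-sided P value (normal approximation)
z       <- est / se
wald_ci <- est + c(-1, 1) * qnorm(0.975) * se
p_value <- 2 * pnorm(-abs(z))

round(c(CI_L = wald_ci[1], CI_U = wald_ci[2], P = p_value), 6)
```

The result agrees with the tabulated bounds (1.435464 and 4.514013) and P value (0.0001520) to the printed precision, so the reported intervals are numerically consistent with normal-approximation Wald intervals.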

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet-group effects (main and non-robust models)
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.383
##   Unadjusted ICC: 0.250

i = i+1

2.2.24 aPTH

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aPTH"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                1.740188   0.579897   3.001  0.00387 **
## aBreastFeed_full_stopped   0.912129   0.576207   1.583  0.11849   
## aBreastFeed_full_duration  0.003497   0.048368   0.072  0.94259   
## aBreastFeed_total_stopped  0.074439   0.306435   0.243  0.80887   
## SEXM                       0.192782   0.194414   0.992  0.32522   
## GRPVG                     -0.099803   0.390103  -0.256  0.79892   
## GRPVN                      0.329237   0.326115   1.010  0.31660   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value    
## s(log2_age)  1.00      1 2.601    0.112    
## s(FAM)      61.67     87 2.275 1.87e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.617   Deviance explained = 81.8%
## GCV = 1.2731  Scale est. = 0.60119   n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 2.506   0.118
## aBreastFeed_full_duration  1 0.005   0.943
## aBreastFeed_total_stopped  1 0.059   0.809
## SEX                        1 0.983   0.325
## GRP                        2 0.935   0.398
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F  p-value
## s(log2_age)  1.00   1.00 2.601    0.112
## s(FAM)      61.67  87.00 2.275 1.87e-06
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE
)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error     t value
## (Intercept)                2.067989450 0.49298273  4.19485174
## GRPVG                     -0.016995657 0.35350110 -0.04807809
## GRPVN                      0.375608950 0.29563721  1.27050633
## SEXM                       0.110783438 0.18881964  0.58671566
## aBreastFeed_full_stopped   0.747933025 0.56299165  1.32849754
## aBreastFeed_full_duration  0.007204965 0.04705299  0.15312449
## aBreastFeed_total_stopped -0.024917109 0.29353555 -0.08488617
## log2_age                  -0.136813968 0.12535498 -1.09141232


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of parathormone level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of parathormone level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval

Contrast    Estimate     CI-L         CI-U        P
VN vs OM     0.3756089   -0.2038293   0.9550472   0.2039043
VG vs OM    -0.0169957   -0.7098451   0.6758538   0.9616540
VN vs VG     0.3926046   -0.2383901   1.0235993   0.2226588
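
The VN vs OM contrast can be cross-checked against the GRPVN row of the coefficient table printed by `summary(mod_main)`. The sketch below assumes Wald-type intervals (estimate ± 1.96 × SE, normal approximation). The internals of `emm()` are not shown in this section, so this is a consistency check under that assumption, not a description of how `emm()` is implemented. All variable names are illustrative.

```r
# Estimate and standard error of GRPVN, copied from summary(mod_main)
est <- 0.375608950
se  <- 0.29563721

# Wald-type 95% interval and two-sided p-value
# (normal approximation; an assumption about how the table was produced)
z    <- qnorm(0.975)
ci_l <- est - z * se
ci_u <- est + z * se
p    <- 2 * pnorm(-abs(est / se))

round(c(Estimate = est, CI_L = ci_l, CI_U = ci_u, P = p), 4)

# VN vs VG is the difference between the GRPVN and GRPVG coefficients
vn_vs_vg <- est - (-0.016995657)
vn_vs_vg
```

Under this assumption the computed bounds and p-value should agree with the VN vs OM row to about three decimal places, and the coefficient difference should match the VN vs VG estimate.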

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other covariates (none were added for this outcome)
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))
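
The leave-one-factor models are later compared by AIC. `rlme()` and `add_AIC()` are helper functions of this report and are not reproduced here, so the sketch below illustrates the general idea with ordinary `lm()` fits on the built-in `mtcars` data. The model terms and object names are purely illustrative and unrelated to the study data.

```r
# Generic illustration of a leave-one-factor AIC comparison.
# Uses built-in data; NOT the report's rlme()/add_AIC() helpers.
full   <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)
no_cyl <- update(full, . ~ . - factor(cyl))  # drop one factor
no_wt  <- update(full, . ~ . - wt)           # drop another predictor

aic <- c(full = AIC(full), no_cyl = AIC(no_cyl), no_wt = AIC(no_wt))

# Difference to the full model: positive values mean the reduced model
# has a higher AIC than the full model
delta_aic <- aic - aic[["full"]]
delta_aic
```

A large positive difference for a reduced model suggests that the omitted term carries information about the outcome; differences near zero or negative suggest it adds little.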

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet effects: robust (res) and non-robust (res2) estimates
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.600
##   Unadjusted ICC: 0.557
## move to the next outcome column
i <- i + 1
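
The printed ICC output has the layout produced by `performance::icc()` (an assumption, since the package call is not visible in this section). In that function's documentation, the adjusted ICC relates the random-effect variance to the sum of random-effect and residual variance, while the unadjusted ICC also counts the fixed-effects variance in the denominator. A minimal sketch with hypothetical variance components, not taken from any fitted model in this report:

```r
# Hypothetical variance components (illustrative values only)
var_family   <- 0.60  # between-family (random intercept) variance
var_residual <- 0.40  # within-family residual variance
var_fixed    <- 0.10  # variance attributable to fixed effects

# Adjusted ICC: share of family variance among random + residual variance
icc_adjusted <- var_family / (var_family + var_residual)

# Unadjusted ICC: fixed-effects variance is added to the denominator
icc_unadjusted <- var_family / (var_family + var_residual + var_fixed)

c(adjusted = icc_adjusted, unadjusted = icc_unadjusted)
```

Because the denominator grows, the unadjusted ICC cannot exceed the adjusted one, which is the pattern seen in the ICC outputs of this report.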

2.2.25 aCros

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Ca),
         !is.na(aSUP_D)) 

column_name
## [1] "aCros"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca +
    aSUP_D +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.052102   0.227458   4.625 1.29e-05 ***
## aBreastFeed_full_stopped   0.188844   0.236859   0.797    0.427    
## aBreastFeed_full_duration  0.013003   0.012868   1.010    0.315    
## aBreastFeed_total_stopped -0.110060   0.080126  -1.374    0.173    
## SEXM                      -0.006262   0.052808  -0.119    0.906    
## GRPVG                      0.036192   0.085240   0.425    0.672    
## GRPVN                      0.019901   0.070882   0.281    0.780    
## aSUP_Ca                   -0.173916   0.142349  -1.222    0.225    
## aSUP_D                     0.028563   0.065643   0.435    0.665    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value  
## s(log2_age)  7.75  8.448 2.517  0.0142 *
## s(FAM)      27.01 86.000 0.500  0.0168 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.36   Deviance explained = 57.1%
## GCV = 0.093688  Scale est. = 0.06239   n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.636   0.427
## aBreastFeed_full_duration  1 1.021   0.315
## aBreastFeed_total_stopped  1 1.887   0.173
## SEX                        1 0.014   0.906
## GRP                        2 0.092   0.912
## aSUP_Ca                    1 1.493   0.225
## aSUP_D                     1 0.189   0.665
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  7.750  8.448 2.517  0.0142
## s(FAM)      27.013 86.000 0.500  0.0168
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
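
In the diagnostic GAMs, `s(log2_age)` is a penalized smooth and `s(FAM, bs='re')` acts as a random intercept per family. An effective degrees of freedom (edf) value close to 1 for the age smooth indicates an approximately linear effect, whereas larger values point to curvature. The sketch below reproduces this structure on simulated data; the data, variable names and the use of `method = "REML"` are assumptions made for illustration and differ from the report's calls, which use the default smoothing-parameter selection.

```r
library(mgcv)

set.seed(1)
n_fam <- 30
fam   <- factor(rep(seq_len(n_fam), each = 3))   # 3 children per family
u     <- rnorm(n_fam, sd = 1)[as.integer(fam)]   # family-level intercepts
x     <- runif(length(fam), 0, 4)                # stand-in for log2_age
y     <- sin(x) + u + rnorm(length(fam), sd = 0.5)
d     <- data.frame(y = y, x = x, fam = fam)

# s(x): penalized smooth; s(fam, bs = "re"): random intercept for family
m <- gam(y ~ s(x) + s(fam, bs = "re"), data = d, method = "REML")

summary(m)$s.table  # edf, reference df, F and p-value for both smooth terms
gam.vcomp(m)        # standard deviations of the random effect and residual
```

The grouping variable must be a factor for the `"re"` basis to act as a random intercept.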

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca +
    aSUP_D +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)
## (Intercept)               -0.005831   0.256938  -0.023    0.982
## aBreastFeed_full_stopped   0.242759   0.267456   0.908    0.367
## aBreastFeed_full_duration  0.012185   0.014464   0.842    0.402
## aBreastFeed_total_stopped -0.108066   0.089873  -1.202    0.232
## SEXM                       0.007417   0.059199   0.125    0.901
## GRPVG                      0.080354   0.094759   0.848    0.399
## GRPVN                      0.075429   0.078903   0.956    0.342
## aSUP_Ca                   -0.212942   0.159533  -1.335    0.185
## aSUP_D                     0.028037   0.073220   0.383    0.703
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  8.14  8.688 3.096 0.00385 **
## s(FAM)      24.66 86.000 0.442 0.02392 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.365   Deviance explained = 56.4%
## GCV = 0.11837  Scale est. = 0.080598  n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.824   0.367
## aBreastFeed_full_duration  1 0.710   0.402
## aBreastFeed_total_stopped  1 1.446   0.232
## SEX                        1 0.016   0.901
## GRP                        2 0.538   0.586
## aSUP_Ca                    1 1.782   0.185
## aSUP_D                     1 0.147   0.703
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  8.140  8.688 3.096 0.00385
## s(FAM)      24.663 86.000 0.442 0.02392
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE, include = c('aSUP_Ca', 'aSUP_D'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE
)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error     t value
## (Intercept)                0.015552802 0.15963859  0.09742508
## GRPVG                      0.084899520 0.08989729  0.94440574
## GRPVN                      0.055022493 0.07511070  0.73255200
## SEXM                       0.053202344 0.06082597  0.87466496
## aBreastFeed_full_stopped   0.167486445 0.18624144  0.89929743
## aBreastFeed_full_duration  0.025272107 0.01503215  1.68120409
## aBreastFeed_total_stopped -0.045919713 0.09081623 -0.50563334
## log2_age                  -0.039324771 0.04174356 -0.94205596
## aSUP_Ca                   -0.199618968 0.16431147 -1.21488148
## aSUP_D                    -0.006484379 0.07229079 -0.08969855


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(beta-CrossLaps). CI-L and CI-U are the lower and upper bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(beta-CrossLaps). CI-L and CI-U are the lower and upper bounds of the 95% confidence interval

Contrast    Estimate     CI-L         CI-U        P
VN vs OM     0.0550225   -0.0921918   0.2022368   0.4638317
VG vs OM     0.0848995   -0.0912959   0.2610950   0.3449623
VN vs VG    -0.0298770   -0.1905284   0.1307743   0.7154825

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Ca', 'aSUP_D'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Ca', 'aSUP_D'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Ca', 'aSUP_D'), exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet effects: robust (res) and non-robust (res2) estimates
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.137
##   Unadjusted ICC: 0.126
## move to the next outcome column
i <- i + 1

2.2.26 aP1NP

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSUP_Ca),
         !is.na(aSUP_D)) 

column_name
## [1] "aP1NP"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca +
    aSUP_D +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                609.331    102.691   5.934 3.29e-08 ***
## aBreastFeed_full_stopped   119.996    109.182   1.099   0.2741    
## aBreastFeed_full_duration  -11.209      6.585  -1.702   0.0914 .  
## aBreastFeed_total_stopped   -1.849     40.760  -0.045   0.9639    
## SEXM                        34.222     26.788   1.277   0.2040    
## GRPVG                       67.598     39.787   1.699   0.0921 .  
## GRPVN                       60.960     33.120   1.841   0.0683 .  
## aSUP_Ca                     51.881     71.521   0.725   0.4697    
## aSUP_D                       3.413     31.727   0.108   0.9145    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F p-value    
## s(log2_age) 4.279   5.31 29.744  <2e-16 ***
## s(FAM)      4.328  86.00  0.053   0.377    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.757   Deviance explained = 78.8%
## GCV =  24348  Scale est. = 21075     n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + aSUP_D + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.208  0.2741
## aBreastFeed_full_duration  1 2.898  0.0914
## aBreastFeed_total_stopped  1 0.002  0.9639
## SEX                        1 1.632  0.2040
## GRP                        2 2.073  0.1305
## aSUP_Ca                    1 0.526  0.4697
## aSUP_D                     1 0.012  0.9145
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F p-value
## s(log2_age)  4.279  5.310 29.744  <2e-16
## s(FAM)       4.328 86.000  0.053   0.377
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE, include = c('aSUP_Ca', 'aSUP_D'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE
)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                972.884093  84.051812 11.5748140
## GRPVG                       63.171578  47.332104  1.3346455
## GRPVN                       41.883265  39.546769  1.0590819
## SEXM                        62.890873  32.025671  1.9637645
## aBreastFeed_full_stopped   106.687484  98.058559  1.0879977
## aBreastFeed_full_duration  -15.038083   7.914623 -1.9000379
## aBreastFeed_total_stopped  -58.784503  47.815935 -1.2293915
## log2_age                  -203.808015  21.978533 -9.2730492
## aSUP_Ca                     77.181600  86.512147  0.8921476
## aSUP_D                       4.410471  38.062050  0.1158758


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of procollagen type I N-terminal propeptide (P1NP) level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of procollagen type I N-terminal propeptide (P1NP) level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval

Contrast    Estimate     CI-L          CI-U        P
VN vs OM     41.88327    -35.62698    119.39351   0.2895625
VG vs OM     63.17158    -29.59764    155.94080   0.1819924
VN vs VG    -21.28831   -105.87337     63.29674   0.6218130

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Ca', 'aSUP_D'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(include = c('aSUP_Ca', 'aSUP_D'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Ca', 'aSUP_D'), exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                  exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Ca', 'aSUP_D'),
                     exclude = c('log2_age'))
## boundary (singular) fit: see help('isSingular')

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet effects: robust (res) and non-robust (res2) estimates
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.231
##   Unadjusted ICC: 0.067

## move to the next outcome column
i <- i + 1

2.2.27 aUI

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]



dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aUI"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                257.222     76.274   3.372  0.00109 **
## aBreastFeed_full_stopped   -63.013     75.396  -0.836  0.40547   
## aBreastFeed_full_duration   -4.366      7.030  -0.621  0.53604   
## aBreastFeed_total_stopped    4.090     44.541   0.092  0.92704   
## SEXM                        25.113     27.523   0.912  0.36393   
## GRPVG                      -13.366     40.188  -0.333  0.74020   
## GRPVN                      -22.580     35.961  -0.628  0.53162   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00      1 0.373   0.543
## s(FAM)      12.47     76 0.204   0.165
## 
## R-sq.(adj) =  0.108   Deviance explained = 26.5%
## GCV =  22266  Scale est. = 18196     n = 112
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.698   0.405
## aBreastFeed_full_duration  1 0.386   0.536
## aBreastFeed_total_stopped  1 0.008   0.927
## SEX                        1 0.833   0.364
## GRP                        2 0.197   0.821
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 0.373   0.543
## s(FAM)      12.47  76.00 0.204   0.165
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                8.04257    0.63343  12.697  < 2e-16 ***
## aBreastFeed_full_stopped  -0.75174    0.60297  -1.247  0.21674    
## aBreastFeed_full_duration  0.02367    0.05582   0.424  0.67288    
## aBreastFeed_total_stopped -0.14546    0.36373  -0.400  0.69047    
## SEXM                       0.32294    0.22369   1.444  0.15338    
## GRPVG                     -0.39657    0.36937  -1.074  0.28675    
## GRPVN                     -0.98035    0.32468  -3.019  0.00355 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 0.028 0.86820   
## s(FAM)      35.42     76 0.857 0.00589 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.428   Deviance explained = 64.7%
## GCV = 1.4473  Scale est. = 0.88624   n = 112
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.554  0.2167
## aBreastFeed_full_duration  1 0.180  0.6729
## aBreastFeed_total_stopped  1 0.160  0.6905
## SEX                        1 2.084  0.1534
## GRP                        2 4.881  0.0104
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 0.028 0.86820
## s(FAM)      35.42  76.00 0.857 0.00589
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The log2 transformation fits better, so the main model uses the log2-transformed outcome.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE
)


summary(mod_main)[['coefficients']]
##                                Estimate Std. Error     t value
## (Intercept)                7.9429665929 0.40575072 19.57597642
## GRPVG                     -0.3741851415 0.25172558 -1.48648040
## GRPVN                     -0.7851984224 0.22803581 -3.44331196
## SEXM                       0.2823316997 0.18153910  1.55521156
## aBreastFeed_full_stopped  -0.6161899099 0.50279201 -1.22553639
## aBreastFeed_full_duration  0.0006774979 0.04665396  0.01452177
## aBreastFeed_total_stopped -0.1679532753 0.29429818 -0.57069084
## log2_age                   0.0868093119 0.13085245  0.66341374


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(urinary iodine level). CI-L and CI-U are the lower and upper bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(urinary iodine level). CI-L and CI-U are the lower and upper bounds of the 95% confidence interval

Contrast    Estimate     CI-L         CI-U        P
VN vs OM    -0.7851984   -1.2321404   -0.3382565   0.0005746
VG vs OM    -0.3741851   -0.8675582    0.1191879   0.1371521
VN vs VG    -0.4110133   -0.8597072    0.0376807   0.0725949
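
Because this model was fitted to the log2-transformed outcome, the contrasts are differences on the log2 scale. Raising 2 to the power of an estimate converts it into a ratio between groups on the original scale. The sketch below does this for the VN vs OM row; the variable names are illustrative.

```r
# VN vs OM contrast for log2(urinary iodine), copied from the table above
est  <- -0.7851984
ci_l <- -1.2321404
ci_u <- -0.3382565

# Back-transform from the log2 scale to a ratio (VN relative to OM)
ratio <- 2^c(estimate = est, lower = ci_l, upper = ci_u)

# Same information expressed as percentage difference
pct_change <- (ratio - 1) * 100

round(ratio, 2)
round(pct_change, 0)
```

Read this way, urinary iodine in vegan children is about 0.58 times that of omnivore children, with the interval spanning roughly 0.43 to 0.79.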

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other covariates (none were added for this outcome)
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet effects: robust (res) and non-robust (res2) estimates
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.304
##   Unadjusted ICC: 0.261

## move to the next outcome column
i <- i + 1

2.2.28 aUREA

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name),
    log2_age_2 = log2_age^2) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aUREA"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                3.45083    0.58839   5.865  8.1e-08 ***
## aBreastFeed_full_stopped   1.02339    0.60190   1.700  0.09268 .  
## aBreastFeed_full_duration -0.06073    0.04243  -1.431  0.15592    
## aBreastFeed_total_stopped  0.43588    0.26308   1.657  0.10118    
## SEXM                       0.28655    0.17168   1.669  0.09874 .  
## GRPVG                     -0.18911    0.27425  -0.690  0.49233    
## GRPVN                     -0.61662    0.23324  -2.644  0.00974 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value   
## s(log2_age)  2.612  3.183 4.241 0.00616 **
## s(FAM)      37.197 88.000 0.778 0.00258 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.579   Deviance explained = 72.5%
## GCV = 0.99419  Scale est. = 0.64429   n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 2.891  0.0927
## aBreastFeed_full_duration  1 2.049  0.1559
## aBreastFeed_total_stopped  1 2.745  0.1012
## SEX                        1 2.786  0.0987
## GRP                        2 3.799  0.0262
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.612  3.183 4.241 0.00616
## s(FAM)      37.197 88.000 0.778 0.00258
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE, include = c('log2_age_2'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE
)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                3.44892379 0.51258305  6.7285171
## GRPVG                     -0.17017796 0.24549610 -0.6932002
## GRPVN                     -0.54650504 0.21336673 -2.5613414
## SEXM                       0.23096388 0.17993118  1.2836234
## aBreastFeed_full_stopped   0.31493160 0.60853357  0.5175254
## aBreastFeed_full_duration -0.07411729 0.04429197 -1.6733800
## aBreastFeed_total_stopped  0.45065791 0.27139231  1.6605405
## log2_age                   1.12052504 0.27356103  4.0960697
## log2_age_2                -0.28005981 0.08305899 -3.3718183


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of urea level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval') %>% 
  kable_styling('striped', full_width = TRUE) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of urea level. CI-L and CI-U are the lower and upper bounds of the 95% confidence interval

Contrast    Estimate     CI-L         CI-U        P
VN vs OM    -0.5465050   -0.9646962   -0.1283139   0.0104269
VG vs OM    -0.1701780   -0.6513415    0.3109856   0.4881839
VN vs VG    -0.3763271   -0.8346064    0.0819522   0.1075133
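
The urea model includes both `log2_age` and its square, so the fitted age effect is a parabola. With a negative quadratic coefficient the curve has a maximum at `-b1 / (2 * b2)`. The sketch below locates it from the printed coefficients. Whether `log2_age` is centered and in which units age was recorded is not stated in this section, so the result is given on the `log2_age` scale only; the variable names are illustrative.

```r
# Coefficients copied from summary(mod_main) for the urea model
b1 <-  1.12052504   # log2_age
b2 <- -0.28005981   # log2_age_2

# Location of the turning point of b1 * x + b2 * x^2
vertex <- -b1 / (2 * b2)

# b2 < 0 means the turning point is a maximum
is_maximum <- b2 < 0

c(vertex_log2_age = vertex, is_maximum = is_maximum)
```

If `log2_age` is the uncentered log2 of age in years (an assumption), a value near 2 would correspond to an age of about 4 years.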

Leave-one-factor models

### main model but non-robust
### (fitted here with remove_random = TRUE, i.e. without the family random effect)
mod_main <- rlme(include = c('log2_age_2'), remove_random = TRUE)
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('log2_age_2'),
                 exclude = c('SEX'),
                 remove_random = TRUE)

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('log2_age_2'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'),
                remove_random = TRUE)

### model without diet groups
mod_ndiet <- rlme(include = c('log2_age_2'),
                  exclude = c('GRP'),
                  remove_random = TRUE)

### model without log2_age_2 (not fitted)
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'),
                     include = c('log2_age_2'),
                     remove_random = TRUE)

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

## diet effects: robust (res) and non-robust (res2) estimates
diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## Warning: `model` has no random effects.
## NULL

## move to the next outcome column
i <- i + 1

2.2.29 aCREA

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aCREA"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(aAGE) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(aAGE) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                26.0851     2.0091  12.984   <2e-16 ***
## aBreastFeed_full_stopped    2.1944     2.2108   0.993    0.323    
## aBreastFeed_full_duration  -0.2018     0.1852  -1.090    0.278    
## aBreastFeed_total_stopped  -0.6274     1.0611  -0.591    0.556    
## SEXM                        0.3388     0.7476   0.453    0.651    
## GRPVG                      -0.5692     1.1110  -0.512    0.610    
## GRPVN                      -1.3896     0.9561  -1.453    0.149    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##            edf Ref.df      F p-value    
## s(aAGE)  1.518  1.848 98.142  <2e-16 ***
## s(FAM)  21.927 88.000  0.393  0.0147 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.738   Deviance explained = 79.7%
## GCV = 19.084  Scale est. = 14.716    n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(aAGE) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.985   0.323
## aBreastFeed_full_duration  1 1.187   0.278
## aBreastFeed_total_stopped  1 0.350   0.556
## SEX                        1 0.205   0.651
## GRP                        2 1.089   0.341
## 
## Approximate significance of smooth terms:
##            edf Ref.df      F p-value
## s(aAGE)  1.518  1.848 98.142  <2e-16
## s(FAM)  21.927 88.000  0.393  0.0147
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(aAGE) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(aAGE) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                4.623918   0.114019  40.554   <2e-16 ***
## aBreastFeed_full_stopped   0.134109   0.123128   1.089    0.279    
## aBreastFeed_full_duration -0.009475   0.010274  -0.922    0.359    
## aBreastFeed_total_stopped -0.029880   0.060658  -0.493    0.623    
## SEXM                       0.018897   0.041596   0.454    0.651    
## GRPVG                     -0.028555   0.061805  -0.462    0.645    
## GRPVN                     -0.078300   0.053069  -1.475    0.143    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##            edf Ref.df      F p-value    
## s(aAGE)  2.063  2.552 57.442  <2e-16 ***
## s(FAM)  22.621 88.000  0.386  0.0262 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.722   Deviance explained = 78.7%
## GCV = 0.058624  Scale est. = 0.044658  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(aAGE) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.186   0.279
## aBreastFeed_full_duration  1 0.851   0.359
## aBreastFeed_total_stopped  1 0.243   0.623
## SEX                        1 0.206   0.651
## GRP                        2 1.142   0.323
## 
## Approximate significance of smooth terms:
##            edf Ref.df      F p-value
## s(aAGE)  2.063  2.552 57.442  <2e-16
## s(FAM)  22.621 88.000  0.386  0.0262
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                4.46250973 0.11528960 38.7069584
## GRPVG                     -0.05970886 0.06193321 -0.9640847
## GRPVN                     -0.10935941 0.05369952 -2.0365062
## SEXM                       0.03302511 0.04484794  0.7363795
## aBreastFeed_full_stopped  -0.11770817 0.13704639 -0.8588929
## aBreastFeed_full_duration -0.00847680 0.01116296 -0.7593686
## aBreastFeed_total_stopped -0.07609532 0.06716510 -1.1329593
## log2_age                   0.31510609 0.02944252 10.7024174


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(creatinine) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(creatinine) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U         P
VN vs OM   -0.1093594   -0.2146085   -0.0041103   0.0416995
VG vs OM   -0.0597089   -0.1810957    0.0616780   0.3350035
VN vs VG   -0.0496505   -0.1653684    0.0660673   0.4003741
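
Because the outcome is on the log2 scale, each contrast is a difference in log2 units, and 2 raised to that difference gives a ratio of geometric means on the original scale. The sketch below only illustrates that reading. The helper `log2_to_ratio()` is hypothetical and not part of the original code, and the numbers are copied from the table above.

```r
# Hypothetical helper (not from the original report): convert a difference
# on the log2 scale into a ratio and a percent change on the original scale.
log2_to_ratio <- function(x) {
  ratio <- 2^x
  data.frame(log2_diff = x, ratio = ratio, pct_change = 100 * (ratio - 1))
}

# VN vs OM contrast for log2(creatinine): estimate and 95% CI bounds
# copied from the table above.
vn_vs_om <- c(estimate = -0.1093594, CI_L = -0.2146085, CI_U = -0.0041103)
back <- log2_to_ratio(vn_vs_om)
print(round(back, 3))
```

Read this way, the VN vs OM estimate corresponds to a ratio of about 0.93, that is, creatinine roughly 7% lower in vegan than in omnivore children, with an interval whose upper bound lies just below a ratio of 1.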

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))
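
The leave-one-factor models above are built with the authors' helper `rlme()`, whose definition is not shown in this section, and are later compared by AIC. As a rough illustration of the idea only, the sketch below uses simulated data and plain `lm()` as a stand-in for the mixed model. The data frame `sim`, its effect sizes, and the object names are assumptions made for this example.

```r
# Simulated stand-in data (not study data): only log2_age truly affects
# the outcome, diet group and sex do not.
set.seed(1)
n <- 120
sim <- data.frame(
  GRP = factor(sample(c("OM", "VG", "VN"), n, replace = TRUE)),
  SEX = factor(sample(c("F", "M"), n, replace = TRUE)),
  log2_age = runif(n, 0, 4)
)
sim$outcome <- 4.5 + 0.3 * sim$log2_age + rnorm(n, sd = 0.2)

# Full model, then one reduced model per dropped term.
full <- lm(outcome ~ GRP + SEX + log2_age, data = sim)
terms_to_drop <- c("GRP", "SEX", "log2_age")
reduced <- lapply(terms_to_drop, function(term) {
  update(full, as.formula(paste(". ~ . -", term)))
})
names(reduced) <- paste0("without_", terms_to_drop)

# AIC of each model and its difference from the full model.
aic_tab <- data.frame(
  model = c("full", names(reduced)),
  AIC = c(AIC(full), vapply(reduced, AIC, numeric(1)))
)
aic_tab$delta_AIC <- aic_tab$AIC - AIC(full)
print(aic_tab)
```

A large positive `delta_AIC` after dropping a term suggests that the term carries information about the outcome; values near zero or negative suggest it adds little.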

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.169
##   Unadjusted ICC: 0.062

i = i+1
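
The two ICC values reported above differ in their denominator. Assuming `icc()` here is `performance::icc()`, the adjusted ICC relates the family-level variance to the sum of family-level and residual variance, while the unadjusted ICC also adds the variance explained by the fixed effects to the denominator. A minimal sketch of that arithmetic follows; the helper `icc_from_variances()` is hypothetical and the variance components are made up, not study estimates.

```r
# Hypothetical helper (not part of the original code): ICC from the variance
# components of a random-intercept model.
icc_from_variances <- function(var_random, var_residual, var_fixed = 0) {
  c(
    adjusted   = var_random / (var_random + var_residual),
    unadjusted = var_random / (var_random + var_residual + var_fixed)
  )
}

# Made-up variance components for illustration only (not study estimates).
icc_example <- icc_from_variances(var_random = 0.2, var_residual = 0.8,
                                  var_fixed = 1.0)
print(icc_example)
```

Under this definition, an unadjusted ICC well below the adjusted one, as seen for log2(creatinine), would be consistent with the fixed effects (here mainly age) explaining a large share of the total variance.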

2.2.30 aUA

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aUA"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               192.3535    23.0389   8.349 1.51e-12 ***
## aBreastFeed_full_stopped   30.9160    23.5010   1.316    0.192    
## aBreastFeed_full_duration   0.8395     1.9583   0.429    0.669    
## aBreastFeed_total_stopped   5.5272    11.9410   0.463    0.645    
## SEXM                        3.2065     7.8328   0.409    0.683    
## GRPVG                      -5.1456    13.1704  -0.391    0.697    
## GRPVN                      -7.8236    11.1575  -0.701    0.485    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value    
## s(log2_age)  1.0      1 0.154 0.695498    
## s(FAM)      43.8     88 1.093 0.000231 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.452   Deviance explained = 66.3%
## GCV = 2095.7  Scale est. = 1279.5    n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.731   0.192
## aBreastFeed_full_duration  1 0.184   0.669
## aBreastFeed_total_stopped  1 0.214   0.645
## SEX                        1 0.168   0.683
## GRP                        2 0.246   0.782
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F  p-value
## s(log2_age)  1.0    1.0 0.154 0.695498
## s(FAM)      43.8   88.0 1.093 0.000231
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)               191.5011255  18.983604 10.0877117
## GRPVG                      -4.9438565  13.230094 -0.3736826
## GRPVN                      -2.3808811  11.184973 -0.2128643
## SEXM                        3.9727707   7.279530  0.5457455
## aBreastFeed_full_stopped   34.5743917  21.691870  1.5938871
## aBreastFeed_full_duration  -0.9849698   1.817146 -0.5420421
## aBreastFeed_total_stopped   2.9853057  11.255966  0.2652199
## log2_age                    4.7060688   4.838499  0.9726299


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of uric acid level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of uric acid level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate    CI-L        CI-U       P
VN vs OM   -2.380881   -24.30302   19.54126   0.8314328
VG vs OM   -4.943856   -30.87436   20.98665   0.7086404
VN vs VG    2.562975   -20.97969   26.10564   0.8310370
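
The interval and P value for VN vs OM are numerically consistent with a normal-approximation (Wald) calculation from the coefficient and its standard error. Whether the authors' `emm()` helper computes them exactly this way is an assumption, since its definition is not shown here. The sketch below reproduces the arithmetic.

```r
# Values copied from summary(mod_main)[['coefficients']] above (GRPVN row).
est <- -2.3808811
se  <- 11.184973

z <- qnorm(0.975)                      # two-sided 95% normal quantile
wald_ci <- est + c(-1, 1) * z * se     # lower and upper bound
p_value <- 2 * pnorm(-abs(est / se))   # two-sided normal p-value

print(round(c(CI_L = wald_ci[1], CI_U = wald_ci[2], P = p_value), 4))
```

The result can be compared against the VN vs OM row of the table above.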

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                             'aBreastFeed_total_stopped',
                             'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.435
##   Unadjusted ICC: 0.411

i = i+1

2.2.31 aVIT_AKTB12

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name),
         log2_age_2 = log2_age^2) %>% 
  filter(!is.na(outcome),
         !is.na(aSup_B12))

column_name
## [1] "aVIT_AKTB12"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age_2) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age_2) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                 48.089     30.648   1.569   0.1216  
## aBreastFeed_full_stopped    79.211     33.534   2.362   0.0212 *
## aBreastFeed_full_duration   -4.839      2.764  -1.751   0.0847 .
## aBreastFeed_total_stopped    6.203     17.401   0.356   0.7227  
## SEXM                        -8.574     11.489  -0.746   0.4582  
## GRPVG                      -17.486     25.055  -0.698   0.4878  
## GRPVN                       34.524     24.722   1.396   0.1674  
## aSup_B12                    48.391     21.189   2.284   0.0257 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                 edf Ref.df     F  p-value    
## s(log2_age_2)  2.69  3.223 2.096    0.105    
## s(FAM)        56.27 86.000 2.160 9.07e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.698   Deviance explained = 85.1%
## GCV = 4239.6  Scale est. = 2072.5    n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age_2) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 5.580  0.0212
## aBreastFeed_full_duration  1 3.066  0.0847
## aBreastFeed_total_stopped  1 0.127  0.7227
## SEX                        1 0.557  0.4582
## GRP                        2 3.611  0.0327
## aSup_B12                   1 5.215  0.0257
## 
## Approximate significance of smooth terms:
##                  edf Ref.df     F  p-value
## s(log2_age_2)  2.690  3.223 2.096    0.105
## s(FAM)        56.273 86.000 2.160 9.07e-07
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age_2) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age_2) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                5.43921    0.28279  19.234  < 2e-16 ***
## aBreastFeed_full_stopped   1.29622    0.31183   4.157 9.33e-05 ***
## aBreastFeed_full_duration -0.04785    0.02570  -1.862   0.0670 .  
## aBreastFeed_total_stopped  0.19678    0.15976   1.232   0.2224    
## SEXM                      -0.11325    0.10658  -1.063   0.2918    
## GRPVG                     -0.24065    0.22615  -1.064   0.2911    
## GRPVN                      0.26439    0.22321   1.185   0.2404    
## aSup_B12                   0.49931    0.19189   2.602   0.0114 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                  edf Ref.df     F p-value    
## s(log2_age_2)  2.506  3.016 2.767  0.0483 *  
## s(FAM)        53.226 86.000 1.902 2.1e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.716   Deviance explained = 85.3%
## GCV = 0.36607  Scale est. = 0.18798   n = 131
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age_2) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 17.279 9.33e-05
## aBreastFeed_full_duration  1  3.466   0.0670
## aBreastFeed_total_stopped  1  1.517   0.2224
## SEX                        1  1.129   0.2918
## GRP                        2  4.088   0.0211
## aSup_B12                   1  6.771   0.0114
## 
## Approximate significance of smooth terms:
##                  edf Ref.df     F p-value
## s(log2_age_2)  2.506  3.016 2.767  0.0483
## s(FAM)        53.226 86.000 1.902 2.1e-06
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE, include = c('log2_age_2', 'aSup_B12'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                5.81633638 0.30177105 19.2740040
## GRPVG                     -0.31206834 0.21407573 -1.4577474
## GRPVN                      0.29838441 0.20920353  1.4262877
## SEXM                      -0.06554295 0.09910917 -0.6613207
## aBreastFeed_full_stopped   0.61497174 0.34869992  1.7636131
## aBreastFeed_full_duration -0.04817121 0.02368617 -2.0337273
## aBreastFeed_total_stopped  0.13626617 0.15186935  0.8972592
## log2_age                   0.49929784 0.15266233  3.2706028
## log2_age_2                -0.12352357 0.04658375 -2.6516451
## aSup_B12                   0.49176271 0.17968360  2.7368258


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(active vitamin B12) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(active vitamin B12) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM    0.2983844   -0.1116470   0.7084158   0.1537853
VG vs OM   -0.3120683   -0.7316490   0.1075124   0.1449102
VN vs VG    0.6104527    0.2832585   0.9376470   0.0002554
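
The three contrasts in the table can be recovered from the fixed-effect estimates alone, because OM is the reference level of `GRP`. The sketch below shows this arithmetic with values copied from the coefficient table above. That `emm()` derives its estimates this way is an assumption, and the object names are introduced only for illustration.

```r
# Fixed-effect estimates copied from the coefficient table above;
# OM is the reference level, so its coefficient is 0 by construction.
coefs <- c(OM = 0, VG = -0.31206834, VN = 0.29838441)

# Pairwise differences between diet groups on the log2 scale.
contrasts_tab <- c(
  "VN vs OM" = coefs[["VN"]] - coefs[["OM"]],
  "VG vs OM" = coefs[["VG"]] - coefs[["OM"]],
  "VN vs VG" = coefs[["VN"]] - coefs[["VG"]]
)
print(round(contrasts_tab, 7))
```

The standard error of the VN vs VG contrast cannot be obtained from the two coefficient standard errors alone, since it also depends on their covariance. This is why VN vs VG can differ clearly from zero while neither comparison against OM does.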

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('log2_age_2', 'aSup_B12'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(include = c('log2_age_2', 'aSup_B12'),
                   remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('log2_age_2', 'aSup_B12'),
                 exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(
  include = c('log2_age_2', 'aSup_B12'),
  exclude = c('aBreastFeed_full_stopped',
              'aBreastFeed_total_stopped',
              'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(
  include = c('log2_age_2', 'aSup_B12'),
  exclude = c('GRP'))

### model without other cov - log2_age_2 and supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(
  exclude = c('log2_age'),
  include = c('log2_age_2', 'aSup_B12'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.578
##   Unadjusted ICC: 0.346
i = i+1

2.2.32 aHCY

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSup_B12))

column_name
## [1] "aHCY"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               14.15455    1.51935   9.316 1.36e-14 ***
## aBreastFeed_full_stopped  -4.86855    1.55739  -3.126  0.00243 ** 
## aBreastFeed_full_duration -0.05625    0.10825  -0.520  0.60470    
## aBreastFeed_total_stopped -0.42777    0.68795  -0.622  0.53575    
## SEXM                       0.94094    0.46216   2.036  0.04490 *  
## GRPVG                      0.38210    0.90201   0.424  0.67293    
## GRPVN                     -0.31762    0.86082  -0.369  0.71307    
## aSup_B12                  -1.28970    0.77513  -1.664  0.09987 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value  
## s(log2_age)  2.111  2.602 1.393  0.1728  
## s(FAM)      24.802 78.000 0.517  0.0155 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.521   Deviance explained = 65.8%
## GCV = 6.4673  Scale est. = 4.5699    n = 119
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 9.772 0.00243
## aBreastFeed_full_duration  1 0.270 0.60470
## aBreastFeed_total_stopped  1 0.387 0.53575
## SEX                        1 4.145 0.04490
## GRP                        2 0.507 0.60435
## aSup_B12                   1 2.768 0.09987
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.111  2.602 1.393  0.1728
## s(FAM)      24.802 78.000 0.517  0.0155
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                3.75616    0.21431  17.526   <2e-16 ***
## aBreastFeed_full_stopped  -0.51366    0.21638  -2.374   0.0206 *  
## aBreastFeed_full_duration -0.01620    0.01507  -1.074   0.2866    
## aBreastFeed_total_stopped -0.07472    0.09806  -0.762   0.4488    
## SEXM                       0.15144    0.06501   2.330   0.0230 *  
## GRPVG                      0.04349    0.14122   0.308   0.7591    
## GRPVN                     -0.14554    0.13490  -1.079   0.2847    
## aSup_B12                  -0.20281    0.11970  -1.694   0.0950 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.027  2.458 1.548    0.138    
## s(FAM)      44.700 78.000 1.449 9.16e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.66   Deviance explained = 81.5%
## GCV = 0.12525  Scale est. = 0.067648  n = 119
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 5.636  0.0206
## aBreastFeed_full_duration  1 1.155  0.2866
## aBreastFeed_total_stopped  1 0.581  0.4488
## SEX                        1 5.427  0.0230
## GRP                        2 1.612  0.2074
## aSup_B12                   1 2.871  0.0950
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.027  2.458 1.548    0.138
## s(FAM)      44.700 78.000 1.449 9.16e-05
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(main = TRUE, 
       include = c('aSup_B12'),
       remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                3.764273420 0.17399320 21.6346003
## GRPVG                      0.015233446 0.12441901  0.1224366
## GRPVN                     -0.144774193 0.11861983 -1.2204890
## SEXM                       0.213406760 0.06763601  3.1552242
## aBreastFeed_full_stopped  -0.699411784 0.20640200 -3.3885901
## aBreastFeed_full_duration  0.003676956 0.01609629  0.2284350
## aBreastFeed_total_stopped -0.081000594 0.10000649 -0.8099534
## log2_age                   0.020740662 0.04524239  0.4584343
## aSup_B12                  -0.183737288 0.10861244 -1.6916782


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(homocysteine) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(homocysteine) level. CI_L and CI_U are bounds of 95% confidence interval
Estimate CI-L CI-U P
VN vs OM -0.1447742 -0.3772648 0.0877164 0.2222796
VG vs OM 0.0152334 -0.2286233 0.2590902 0.9025532
VN vs VG -0.1600076 -0.3459109 0.0258957 0.0916134

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c("aSup_B12"))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(
  include = c("aSup_B12"),
  remove_random = TRUE
)

### model without sex
mod_nsex <- rlme(
  include = c("aSup_B12"),
  exclude = c("SEX")
)

### model without breastfeeding-related predictors
mod_nbf <- rlme(
  include = c("aSup_B12"),
  exclude = c(
    "aBreastFeed_full_stopped",
    "aBreastFeed_total_stopped",
    "aBreastFeed_full_duration"
  )
)

### model without diet groups
mod_ndiet <- rlme(include = c("aSup_B12"), exclude = c("GRP"))

### model without other cov.
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(
  include = c("aSup_B12"),
  exclude = c("log2_age")
)

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.528
##   Unadjusted ICC: 0.335
i = i+1

2.2.33 aMMA

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
         !is.na(aSup_B12))

column_name
## [1] "aMMA"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               1202.485    166.634   7.216 7.73e-11 ***
## aBreastFeed_full_stopped  -841.224    168.529  -4.992 2.31e-06 ***
## aBreastFeed_full_duration   -4.972     12.373  -0.402    0.689    
## aBreastFeed_total_stopped  -18.384     79.248  -0.232    0.817    
## SEXM                         7.104     51.644   0.138    0.891    
## GRPVG                       30.632     90.395   0.339    0.735    
## GRPVN                      -89.296     88.826  -1.005    0.317    
## aSup_B12                   -79.658     77.861  -1.023    0.309    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age) 1.000      1 1.730   0.191
## s(FAM)      5.775     82 0.076   0.347
## 
## R-sq.(adj) =  0.401   Deviance explained = 46.8%
## GCV =  85718  Scale est. = 75422     n = 123
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 24.916 2.31e-06
## aBreastFeed_full_duration  1  0.162    0.689
## aBreastFeed_total_stopped  1  0.054    0.817
## SEX                        1  0.019    0.891
## GRP                        2  1.613    0.204
## aSup_B12                   1  1.047    0.309
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  1.000  1.000 1.730   0.191
## s(FAM)       5.775 82.000 0.076   0.347
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSup_B12 +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                9.77351    0.43056  22.700  < 2e-16 ***
## aBreastFeed_full_stopped  -1.47317    0.42272  -3.485 0.000847 ***
## aBreastFeed_full_duration -0.02739    0.02982  -0.919 0.361309    
## aBreastFeed_total_stopped -0.14448    0.19600  -0.737 0.463470    
## SEXM                       0.10635    0.12521   0.849 0.398537    
## GRPVG                     -0.01633    0.25586  -0.064 0.949293    
## GRPVN                     -0.50984    0.25349  -2.011 0.048095 *  
## aSup_B12                  -0.33286    0.21754  -1.530 0.130424    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  1.349  1.557 1.019 0.270367    
## s(FAM)      42.597 82.000 1.138 0.000462 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.66   Deviance explained = 80.2%
## GCV = 0.48894  Scale est. = 0.28245   n = 123
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSup_B12 + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1 12.145 0.000847
## aBreastFeed_full_duration  1  0.844 0.361309
## aBreastFeed_total_stopped  1  0.543 0.463470
## SEX                        1  0.721 0.398537
## GRP                        2  3.892 0.024889
## aSup_B12                   1  2.341 0.130424
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  1.349  1.557 1.019 0.270367
## s(FAM)      42.597 82.000 1.138 0.000462
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. Although the fit is still not perfect, we will continue to work with log2-transformed values.
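
One way to make this comparison concrete is to look at the skewness of the residuals on both scales. The sketch below is illustrative only: it uses simulated data and a plain `lm()` instead of the report's `gam()` fit and `pltmd()` diagnostics, and the helper `skew()` is a hypothetical name introduced here.

```r
# Simulated right-skewed outcome (not study data): multiplicative noise
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 2^(8 + 0.5 * x + rnorm(n, sd = 0.8))

# Hypothetical helper: sample skewness of a numeric vector
skew <- function(r) mean((r - mean(r))^3) / sd(r)^3

fit_raw  <- lm(y ~ x)        # outcome on the original scale
fit_log2 <- lm(log2(y) ~ x)  # outcome on the log2 scale

skew_raw  <- skew(residuals(fit_raw))
skew_log2 <- skew(residuals(fit_log2))

# Residuals closer to symmetric (skewness near 0) favour the log2 scale
c(raw = skew_raw, log2 = skew_log2)
```

A skewness much closer to zero on the log2 scale is the numerical counterpart of the visual improvement seen in the diagnostic plots.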

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(
    main = TRUE, 
    include = c('aSup_B12'),
    remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error    t value
## (Intercept)                9.80393388 0.28533468 34.3594188
## GRPVG                     -0.06652829 0.21863678 -0.3042868
## GRPVN                     -0.57940865 0.21730384 -2.6663525
## SEXM                       0.15066327 0.09579644  1.5727440
## aBreastFeed_full_stopped  -1.45766120 0.31105850 -4.6861320
## aBreastFeed_full_duration -0.02594862 0.02282983 -1.1366101
## aBreastFeed_total_stopped -0.15571146 0.15163880 -1.0268576
## log2_age                  -0.08152703 0.06647689 -1.2263966
## aSup_B12                  -0.25480297 0.18494890 -1.3776939


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(methylmalonic acid) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(methylmalonic acid) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U         P
VN vs OM   -0.5794087   -1.0053164   -0.1535009   0.0076679
VG vs OM   -0.0665283   -0.4950485    0.3619919   0.7609094
VN vs VG   -0.5128804   -0.8398318   -0.1859289   0.0021082
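
Because the outcome was log2-transformed, the estimates in this table are differences on the log2 scale. Raising 2 to the power of an estimate turns it into a ratio between groups. The sketch below back-transforms the values copied from the table; the object names `res_log2` and `res_ratio` are introduced here for illustration and are not part of the report's code.

```r
# Estimates and 95% CI bounds copied from the table above (log2 scale)
res_log2 <- data.frame(
  contrast = c("VN vs OM", "VG vs OM", "VN vs VG"),
  estimate = c(-0.5794087, -0.0665283, -0.5128804),
  ci_l     = c(-1.0053164, -0.4950485, -0.8398318),
  ci_u     = c(-0.1535009,  0.3619919, -0.1859289)
)

# Back-transform: 2^x turns a log2 difference into a ratio
res_ratio <- transform(
  res_log2,
  ratio   = 2^estimate,
  ratio_l = 2^ci_l,
  ratio_u = 2^ci_u
)

# Percentage difference relative to the second group of each contrast
res_ratio$pct_diff <- 100 * (res_ratio$ratio - 1)
res_ratio[, c("contrast", "ratio", "ratio_l", "ratio_u", "pct_diff")]
```

For VN vs OM the ratio is about 0.67, that is, methylmalonic acid is roughly one third lower in vegans than in omnivores, with an interval from about 0.50 to 0.90.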

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c("aSup_B12"))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(
  include = c("aSup_B12"),
  remove_random = TRUE
)

### model without sex
mod_nsex <- rlme(
  include = c("aSup_B12"),
  exclude = c("SEX")
)

### model without breastfeeding-related predictors
mod_nbf <- rlme(
  include = c("aSup_B12"),
  exclude = c(
    "aBreastFeed_full_stopped",
    "aBreastFeed_total_stopped",
    "aBreastFeed_full_duration"
  )
)

### model without diet groups
mod_ndiet <- rlme(include = c("aSup_B12"), exclude = c("GRP"))

### model without other covariates (here: B12 supplementation)
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(
  include = c("aSup_B12"),
  exclude = c("log2_age")
)
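
The leave-one-factor models are refits of the main model with one predictor (or block of predictors) removed, so that the loss of fit can be compared across factors. The sketch below illustrates the idea with simulated data and ordinary `lm()` fits compared by AIC. It does not reproduce the report's `rlme()` or `add_AIC()` helpers, whose internals are not shown in this section.

```r
# Simulated data (not study data): outcome depends on group and age, not on sex
set.seed(2)
n   <- 200
dat <- data.frame(
  grp = factor(sample(c("OM", "VG", "VN"), n, replace = TRUE)),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  age = runif(n, 1, 18)
)
dat$y <- 1 + 1.5 * (dat$grp == "VN") + 0.2 * dat$age + rnorm(n)

full <- lm(y ~ grp + sex + age, data = dat)

# Refit the model once per dropped term and collect the AIC values
terms_to_drop <- c("grp", "sex", "age")
aic_drop <- sapply(terms_to_drop, function(tm) {
  AIC(update(full, as.formula(paste(". ~ . -", tm))))
})

# Positive delta: dropping the term worsens the fit (the term is informative)
delta_aic <- aic_drop - AIC(full)
delta_aic
```

A large positive difference marks a factor whose removal clearly degrades the model, while a difference near or below zero suggests the factor adds little.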

Putting key results together

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.437
##   Unadjusted ICC: 0.250
i <- i + 1
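
The ICC quantifies how strongly children from the same family resemble each other. As a rough sketch, and assuming that `icc()` follows the common definition (as used, for example, in the performance package; the source of `icc()` is not shown in this section), the adjusted ICC relates the family variance to the sum of family and residual variance, whereas the unadjusted ICC also adds the variance explained by fixed effects to the denominator. The helper `icc_from_var()` and all variance values below are hypothetical.

```r
# Hypothetical helper: ICC from variance components of a random-intercept model
icc_from_var <- function(var_family, var_resid, var_fixed = 0) {
  var_family / (var_family + var_fixed + var_resid)
}

# Made-up variance components, chosen only for illustration
var_family <- 0.20  # between-family (random-intercept) variance
var_resid  <- 0.25  # residual (within-family) variance
var_fixed  <- 0.30  # variance of the fixed-effects predictions

icc_adj   <- icc_from_var(var_family, var_resid)             # random effects only
icc_unadj <- icc_from_var(var_family, var_resid, var_fixed)  # fixed effects in denominator

round(c(adjusted = icc_adj, unadjusted = icc_unadj), 3)
```

With these made-up values the adjusted ICC is 0.444 and the unadjusted ICC is 0.267. Under this definition the unadjusted value can never exceed the adjusted one, which matches the pattern in the output above.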

2.2.34 aVIT_D

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i + ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>%
  filter(
    !is.na(outcome),
    !is.na(aSUP_D)
  )

column_name
## [1] "aVIT_D"

gamm <- gam(
  outcome ~
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped +
    SEX +
    GRP +
    aSUP_D +
    s(log2_age) +
    s(FAM, bs = "re"),
  data = dat_mod
)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_D + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                71.6826    14.7562   4.858 7.97e-06 ***
## aBreastFeed_full_stopped   16.7005    14.5626   1.147   0.2557    
## aBreastFeed_full_duration  -1.3528     1.0589  -1.278   0.2060    
## aBreastFeed_total_stopped  -0.7997     6.7279  -0.119   0.9058    
## SEXM                        1.2436     4.3318   0.287   0.7750    
## GRPVG                      10.6565     8.4490   1.261   0.2118    
## GRPVN                      12.6014     7.0857   1.778   0.0801 .  
## aSUP_D                      7.3784     6.0060   1.229   0.2237    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value    
## s(log2_age)  2.23  2.697 1.500   0.174    
## s(FAM)      58.59 88.000 2.429  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.682   Deviance explained = 84.5%
## GCV = 610.63  Scale est. = 294.66    n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_D + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.315   0.256
## aBreastFeed_full_duration  1 1.632   0.206
## aBreastFeed_total_stopped  1 0.014   0.906
## SEX                        1 0.082   0.775
## GRP                        2 1.628   0.204
## aSUP_D                     1 1.509   0.224
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.230  2.697 1.500   0.174
## s(FAM)      58.591 88.000 2.429  <2e-16
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
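
The diagnostic models in this report combine a penalized smooth of age, `s(log2_age)`, with a family term `s(FAM, bs = "re")`, which mgcv treats as a random intercept per family. The sketch below shows this structure on simulated data. It is a minimal illustration under stated assumptions: the data are artificial, and `method = "REML"` is chosen here, whereas the report's calls use the `gam()` default.

```r
library(mgcv)

# Simulated clustered data (not study data): 30 families, 4 children each
set.seed(3)
n_fam <- 30
dat <- data.frame(
  FAM      = factor(rep(seq_len(n_fam), each = 4)),
  log2_age = runif(n_fam * 4, 0, 4)
)
fam_effect  <- rnorm(n_fam, sd = 1)  # shared family-level shift
dat$outcome <- 5 + sin(dat$log2_age) + fam_effect[as.integer(dat$FAM)] +
  rnorm(nrow(dat), sd = 0.5)

# s(log2_age): penalized smooth; s(FAM, bs = "re"): random intercept per family
fit <- gam(outcome ~ s(log2_age) + s(FAM, bs = "re"), data = dat, method = "REML")

# The edf of s(FAM) lies between 0 and the number of families (shrinkage)
smooth_tab <- summary(fit)$s.table
smooth_tab
```

An effective degrees of freedom for `s(FAM)` well below the number of families, as in the summaries above, indicates that family intercepts are shrunk toward the overall mean rather than estimated freely.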

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped +
    SEX +
    GRP +
    aSUP_D +
    s(log2_age) +
    s(FAM, bs = "re"),
  data = dat_mod
)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_D + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                6.116517   0.215800  28.343   <2e-16 ***
## aBreastFeed_full_stopped   0.213326   0.213165   1.001   0.3206    
## aBreastFeed_full_duration -0.020286   0.015687  -1.293   0.2005    
## aBreastFeed_total_stopped  0.000527   0.099134   0.005   0.9958    
## SEXM                       0.015259   0.064025   0.238   0.8124    
## GRPVG                      0.214343   0.123161   1.740   0.0865 .  
## GRPVN                      0.248801   0.103243   2.410   0.0188 *  
## aSUP_D                     0.138762   0.088189   1.573   0.1204    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value    
## s(log2_age)  2.12  2.564 1.721   0.134    
## s(FAM)      57.07 88.000 2.294  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.692   Deviance explained = 84.6%
## GCV = 0.13393  Scale est. = 0.066274  n = 133
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_D + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 1.002  0.3206
## aBreastFeed_full_duration  1 1.672  0.2005
## aBreastFeed_total_stopped  1 0.000  0.9958
## SEX                        1 0.057  0.8124
## GRP                        2 3.007  0.0563
## aSUP_D                     1 2.476  0.1204
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  2.120  2.564 1.721   0.134
## s(FAM)      57.068 88.000 2.294  <2e-16
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(
    main = TRUE,
    include = c('aSUP_D'),
    remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error    t value
## (Intercept)                6.148132007 0.16469708 37.3299384
## GRPVG                      0.264362046 0.12385801  2.1343961
## GRPVN                      0.282685631 0.10385865  2.7218303
## SEXM                       0.008431574 0.06062285  0.1390824
## aBreastFeed_full_stopped   0.235689045 0.17659025  1.3346662
## aBreastFeed_full_duration -0.014773044 0.01500834 -0.9843220
## aBreastFeed_total_stopped  0.048562023 0.09383121  0.5175466
## log2_age                  -0.109857130 0.03968015 -2.7685667
## aSUP_D                     0.137831420 0.08628554  1.5973873
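
The main model is wrapped in `run(..., path = ..., reuse = TRUE)`, which suggests that a stored fit is reused when one already exists at the given path. How `run()` is implemented is not shown in this section, so the following is only a sketch of that caching pattern. The helper `cached_fit()`, the `.rds` suffix, and the stand-in `lm()` model are assumptions made for illustration.

```r
# Hypothetical helper (not the report's run()): evaluate `expr` once and cache it
cached_fit <- function(expr, path, reuse = TRUE) {
  file <- paste0(path, ".rds")
  if (reuse && file.exists(file)) {
    return(readRDS(file))  # skip the expensive fit, load the stored result
  }
  result <- expr           # lazy evaluation: expr is only computed here
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, file)
  result
}

# Usage with a cheap stand-in model and a temporary path
path <- file.path(tempdir(), "cache_demo", "mod_demo")
unlink(paste0(path, ".rds"))  # start from an empty cache
n_fits <- 0
fit_model <- function() {
  n_fits <<- n_fits + 1       # count how often the model is actually fitted
  lm(dist ~ speed, data = cars)
}

m1 <- cached_fit(fit_model(), path)  # fits and stores
m2 <- cached_fit(fit_model(), path)  # loads from disk, no refit
```

Caching of this kind keeps repeated rendering of the report fast, at the cost that a stale file must be deleted by hand whenever the data or model specification changes.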


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(vitamin D serum) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(vitamin D serum) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate    CI-L         CI-U        P
VN vs OM   0.2826856    0.0791264   0.4862449   0.0064921
VG vs OM   0.2643620    0.0216048   0.5071193   0.0328104
VN vs VG   0.0183236   -0.1920903   0.2287375   0.8644749
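
The VN vs OM row can be reproduced, up to rounding, from the `GRPVN` estimate and standard error printed in the coefficient table by using a normal approximation. That `emm()` computes its intervals and P values exactly this way is an assumption; the sketch only shows that the printed numbers are consistent with it.

```r
# Estimate and standard error of GRPVN copied from the coefficient table above
est <- 0.282685631
se  <- 0.10385865

# Wald-type 95% interval and two-sided p-value from a normal approximation
z     <- est / se
ci    <- est + c(-1, 1) * qnorm(0.975) * se
p_val <- 2 * pnorm(-abs(z))

round(c(estimate = est, ci_l = ci[1], ci_u = ci[2], p = p_val), 7)
```

The same calculation does not apply directly to the VN vs VG row, because the standard error of that contrast also depends on the covariance between the two group coefficients, which is not printed here.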

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_D'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(
  include = c('aSUP_D'),
  remove_random = TRUE
)

### model without sex
mod_nsex <- rlme(
  include = c('aSUP_D'),
  exclude = c('SEX')
)

### model without breastfeeding-related predictors
mod_nbf <- rlme(
  include = c('aSUP_D'),
  exclude = c(
    'aBreastFeed_full_stopped',
    'aBreastFeed_total_stopped',
    'aBreastFeed_full_duration'
  )
)

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_D'), exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(
  include = c('aSUP_D'),
  exclude = c('log2_age')
)

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.630
##   Unadjusted ICC: 0.526
i <- i + 1

2.2.35 aFOLAT

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i + ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>%
  filter(
    !is.na(outcome),
    !is.na(aSUP_FOL)
  )

column_name
## [1] "aFOLAT"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_FOL +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_FOL + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               11.62157    1.95942   5.931 8.14e-08 ***
## aBreastFeed_full_stopped   1.28782    1.98910   0.647 0.519282    
## aBreastFeed_full_duration  0.01750    0.16778   0.104 0.917192    
## aBreastFeed_total_stopped  0.86410    1.01781   0.849 0.398537    
## SEXM                       0.08888    0.66824   0.133 0.894540    
## GRPVG                      4.49460    1.14098   3.939 0.000179 ***
## GRPVN                      3.93983    0.96862   4.067 0.000114 ***
## aSUP_FOL                   1.81436    2.23353   0.812 0.419118    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(log2_age)  1.00      1 13.997 0.000351 ***
## s(FAM)      46.31     87  1.232 0.000126 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.598   Deviance explained = 76.5%
## GCV = 15.025  Scale est. = 8.7289    n = 132
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_FOL + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1  0.419    0.519
## aBreastFeed_full_duration  1  0.011    0.917
## aBreastFeed_total_stopped  1  0.721    0.399
## SEX                        1  0.018    0.895
## GRP                        2 10.542 9.02e-05
## aSUP_FOL                   1  0.660    0.419
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value
## s(log2_age)  1.00   1.00 13.997 0.000351
## s(FAM)      46.31  87.00  1.232 0.000126
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Fit main model

## main model 
mod_main <- run(
  rlme(
    main = TRUE,
    include = c('aSUP_FOL'),
    remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error      t value
## (Intercept)               13.630219443  1.8204170  7.487416192
## GRPVG                      4.813234906  1.1230674  4.285793270
## GRPVN                      4.298596597  0.9569665  4.491898825
## SEXM                       0.001631222  0.7094717  0.002299207
## aBreastFeed_full_stopped   1.476177673  2.1300517  0.693024340
## aBreastFeed_full_duration  0.015695775  0.1787616  0.087802843
## aBreastFeed_total_stopped  0.775496029  1.0694023  0.725167713
## log2_age                  -1.662083249  0.4696117 -3.539271521
## aSUP_FOL                   1.857874466  2.2798967  0.814894141


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of folate serum level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of folate serum level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L        CI-U       P
VN vs OM    4.2985966    2.422977   6.174216   0.0000071
VG vs OM    4.8132349    2.612063   7.014407   0.0000182
VN vs VG   -0.5146383   -2.547852   1.518576   0.6198255
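
The three rows of this table are linear contrasts of the group coefficients: omnivores are the reference level, so VN vs OM and VG vs OM equal the `GRPVN` and `GRPVG` estimates, and VN vs VG is their difference. The sketch below reproduces the point estimates from the printed coefficients; the names `beta` and `contrasts_mat` are introduced here for illustration.

```r
# Fixed-effect estimates copied from the coefficient table above
# (omnivores, OM, are the reference level, so their coefficient is 0)
beta <- c(OM = 0, VG = 4.813234906, VN = 4.298596597)

# Contrast weights: each row encodes "first group minus second group"
contrasts_mat <- rbind(
  "VN vs OM" = c(OM = -1, VG =  0, VN = 1),
  "VG vs OM" = c(OM = -1, VG =  1, VN = 0),
  "VN vs VG" = c(OM =  0, VG = -1, VN = 1)
)

# The matrix product gives the three pairwise differences
est <- drop(contrasts_mat %*% beta)
round(est, 7)
```

Since folate was modeled on its original scale, these differences are in the measurement units of the outcome rather than log2 ratios.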

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_FOL'))
res2 <- emm(mod_main)

### model without random effect
mod_nonran <- rlme(
  include = c('aSUP_FOL'),
  remove_random = TRUE
)

### model without sex
mod_nsex <- rlme(
  include = c('aSUP_FOL'),
  exclude = c('SEX')
)

### model without breastfeeding-related predictors
mod_nbf <- rlme(
  include = c('aSUP_FOL'),
  exclude = c(
    'aBreastFeed_full_stopped',
    'aBreastFeed_total_stopped',
    'aBreastFeed_full_duration'
  )
)

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_FOL'), exclude = c('GRP'))

### model without supplementation
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(
  include = c('aSUP_FOL'),
  exclude = c('log2_age')
)

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.463
##   Unadjusted ICC: 0.330
i <- i + 1

2.2.36 aIGF1

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aIGF1"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                113.662     27.020   4.207  7.3e-05 ***
## aBreastFeed_full_stopped    20.085     27.641   0.727    0.470    
## aBreastFeed_full_duration   -1.454      1.830  -0.794    0.430    
## aBreastFeed_total_stopped  -12.017     11.786  -1.020    0.311    
## SEXM                        -1.821      7.339  -0.248    0.805    
## GRPVG                       -0.817     12.225  -0.067    0.947    
## GRPVN                       -7.920     10.230  -0.774    0.441    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F p-value    
## s(log2_age)  3.376  4.064 26.368 < 2e-16 ***
## s(FAM)      38.932 81.000  0.985 0.00103 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.75   Deviance explained =   85%
## GCV = 1713.1  Scale est. = 1020.7    n = 122
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.528   0.470
## aBreastFeed_full_duration  1 0.631   0.430
## aBreastFeed_total_stopped  1 1.040   0.311
## SEX                        1 0.062   0.805
## GRP                        2 0.367   0.694
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F p-value
## s(log2_age)  3.376  4.064 26.368 < 2e-16
## s(FAM)      38.932 81.000  0.985 0.00103
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                6.57787    0.29733  22.123   <2e-16 ***
## aBreastFeed_full_stopped   0.50552    0.29694   1.702   0.0934 .  
## aBreastFeed_full_duration -0.04672    0.02357  -1.982   0.0516 .  
## aBreastFeed_total_stopped -0.19417    0.15091  -1.287   0.2027    
## SEXM                      -0.02883    0.09350  -0.308   0.7588    
## GRPVG                      0.02395    0.16764   0.143   0.8868    
## GRPVN                     -0.13797    0.14000  -0.985   0.3280    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F  p-value    
## s(log2_age)  1.499  1.763 55.394  < 2e-16 ***
## s(FAM)      47.256 81.000  1.484 6.47e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.778   Deviance explained = 87.8%
## GCV = 0.28151  Scale est. = 0.15286   n = 122
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 2.898  0.0934
## aBreastFeed_full_duration  1 3.928  0.0516
## aBreastFeed_total_stopped  1 1.656  0.2027
## SEX                        1 0.095  0.7588
## GRP                        2 0.763  0.4705
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F  p-value
## s(log2_age)  1.499  1.763 55.394  < 2e-16
## s(FAM)      47.256 81.000  1.484 6.47e-05
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(
    main = TRUE,
    remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error     t value
## (Intercept)                5.88413783 0.23765951 24.75868852
## GRPVG                      0.01087846 0.17804111  0.06110086
## GRPVN                     -0.15782335 0.14864408 -1.06175337
## SEXM                      -0.03325919 0.09234161 -0.36017554
## aBreastFeed_full_stopped   0.42570075 0.27532899  1.54615300
## aBreastFeed_full_duration -0.05608753 0.02341901 -2.39495729
## aBreastFeed_total_stopped -0.25185425 0.15065679 -1.67170865
## log2_age                   0.60795539 0.06439429  9.44113795


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(insulin-like growth factor 1, IGF-1) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(insulin-like growth factor 1, IGF-1) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM   -0.1578234   -0.4491604   0.1335137   0.2883477
VG vs OM    0.0108785   -0.3380757   0.3598326   0.9512789
VN vs VG   -0.1687018   -0.4893801   0.1519764   0.3024966

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other covariates (none in this model, so nothing is fitted)
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)
## ICC
icc(mod_main)
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.513
##   Unadjusted ICC: 0.228
i <- i + 1

2.2.37 aUr_Ca

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
    !is.na(aSUP_Ca))

column_name
## [1] "aUr_Ca"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca + 
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                1.93548    0.96175   2.012    0.048 *
## aBreastFeed_full_stopped  -0.39001    0.94523  -0.413    0.681  
## aBreastFeed_full_duration  0.02591    0.06359   0.407    0.685  
## aBreastFeed_total_stopped -0.25973    0.46064  -0.564    0.575  
## SEXM                       0.18011    0.26687   0.675    0.502  
## GRPVG                      0.20225    0.44385   0.456    0.650  
## GRPVN                     -0.22058    0.39419  -0.560    0.578  
## aSUP_Ca                   -0.42605    0.70860  -0.601    0.550  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.819  3.426 2.103 0.160585    
## s(FAM)      42.690 85.000 1.085 0.000526 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.47   Deviance explained = 69.6%
## GCV = 2.2338  Scale est. = 1.2699    n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.170   0.681
## aBreastFeed_full_duration  1 0.166   0.685
## aBreastFeed_total_stopped  1 0.318   0.575
## SEX                        1 0.455   0.502
## GRP                        2 0.605   0.549
## aSUP_Ca                    1 0.362   0.550
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.819  3.426 2.103 0.160585
## s(FAM)      42.690 85.000 1.085 0.000526
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca + 
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)
## (Intercept)                0.83789    0.87322   0.960    0.341
## aBreastFeed_full_stopped  -0.72358    0.84789  -0.853    0.397
## aBreastFeed_full_duration  0.05220    0.05755   0.907    0.368
## aBreastFeed_total_stopped -0.64115    0.42168  -1.520    0.133
## SEXM                       0.16781    0.24288   0.691    0.492
## GRPVG                      0.09157    0.42499   0.215    0.830
## GRPVN                     -0.35780    0.37579  -0.952    0.345
## aSUP_Ca                   -0.54204    0.64692  -0.838    0.405
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.593   3.14 2.140    0.135    
## s(FAM)      49.494  85.00 1.505 5.13e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.559   Deviance explained = 77.1%
## GCV = 1.8447  Scale est. = 0.95081   n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 0.728   0.397
## aBreastFeed_full_duration  1 0.823   0.368
## aBreastFeed_total_stopped  1 2.312   0.133
## SEX                        1 0.477   0.492
## GRP                        2 0.889   0.416
## aSUP_Ca                    1 0.702   0.405
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.593  3.140 2.140    0.135
## s(FAM)      49.494 85.000 1.505 5.13e-05
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
  rlme(
    main = TRUE,
    include = c('aSUP_Ca'),
    remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error     t value
## (Intercept)                0.94727737 0.65176625  1.45340046
## GRPVG                      0.03656867 0.48909087  0.07476867
## GRPVN                     -0.42404076 0.42960686 -0.98704373
## SEXM                       0.21814678 0.25089877  0.86946135
## aBreastFeed_full_stopped  -1.01984707 0.71840627 -1.41959657
## aBreastFeed_full_duration  0.04949519 0.06008905  0.82369737
## aBreastFeed_total_stopped -0.98089082 0.42750983 -2.29442870
## log2_age                   0.29158661 0.19068875  1.52912328
## aSUP_Ca                   -0.77805314 0.68394924 -1.13758900


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(urinary Ca) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(urinary Ca) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM   -0.4240408   -1.2660547   0.4179732   0.3236212
VG vs OM    0.0365687   -0.9220318   0.9951692   0.9403988
VN vs VG   -0.4606094   -1.2944359   0.3732171   0.2789448
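
Because the outcome was log2-transformed, the contrasts in the table above are differences in log2 units, and raising 2 to the estimate gives the corresponding ratio between groups. The sketch below back-transforms the tabulated values; the object and column names are illustrative and not part of the report's code.

```r
# Contrast estimates and 95% CI bounds copied from the table above (log2 scale)
contrasts_log2 <- data.frame(
  contrast = c("VN vs OM", "VG vs OM", "VN vs VG"),
  estimate = c(-0.4240408, 0.0365687, -0.4606094),
  ci_l     = c(-1.2660547, -0.9220318, -1.2944359),
  ci_u     = c(0.4179732, 0.9951692, 0.3732171)
)

# Back-transform: a difference d on the log2 scale corresponds to a ratio of 2^d
contrasts_ratio <- transform(
  contrasts_log2,
  ratio   = 2^estimate,
  ratio_l = 2^ci_l,
  ratio_u = 2^ci_u
)

print(contrasts_ratio[, c("contrast", "ratio", "ratio_l", "ratio_u")], digits = 3)
```

Read this way, the VN vs OM estimate of about -0.42 corresponds to a ratio of roughly 0.75, with an interval that spans 1.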

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Ca'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Ca'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Ca'), exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Ca'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Ca'), 
                  exclude = c('GRP'))

### model without other cov.
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Ca'),
                     exclude = c('log2_age'))
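
The leave-one-factor fits above feed an AIC-based comparison of predictor importance. As the `rlme()` and `add_AIC()` helpers are defined elsewhere in the report, the following is only a sketch of the general idea, assuming that importance is read from the AIC change when a term is dropped. It uses `nlme::lme()` and the `Orthodont` example data as stand-ins, neither of which is the report's model or data.

```r
library(nlme)  # recommended package shipped with standard R installations

# Illustrative data only: repeated measurements clustered within Subject
data(Orthodont, package = "nlme")

# Full model, fitted by ML so that AIC is comparable across fixed-effect structures
full <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
            data = Orthodont, method = "ML")

# Leave-one-factor-out fits
no_age <- lme(distance ~ Sex, random = ~ 1 | Subject,
              data = Orthodont, method = "ML")
no_sex <- lme(distance ~ age, random = ~ 1 | Subject,
              data = Orthodont, method = "ML")

# Same fixed effects, but without the random intercept
no_random <- lm(distance ~ age + Sex, data = Orthodont)

# Positive delta AIC means that dropping the term worsens the fit
delta_aic <- c(
  age    = AIC(no_age)    - AIC(full),
  sex    = AIC(no_sex)    - AIC(full),
  random = AIC(no_random) - AIC(full)
)
round(delta_aic, 1)
```

A large positive difference for `random` would indicate that observations from the same cluster (here a subject, in the report a family) are correlated.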

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

i <- i+1

2.2.38 aCa_per_Krea

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome),
    !is.na(aSUP_Ca))

column_name
## [1] "aCa_per_Krea"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.253595   0.243122   5.156 2.14e-06 ***
## aBreastFeed_full_stopped  -0.801508   0.231360  -3.464   0.0009 ***
## aBreastFeed_full_duration  0.020886   0.019600   1.066   0.2902    
## aBreastFeed_total_stopped -0.131031   0.134074  -0.977   0.3317    
## SEXM                       0.004802   0.080493   0.060   0.9526    
## GRPVG                     -0.057879   0.136382  -0.424   0.6725    
## GRPVN                     -0.076396   0.120610  -0.633   0.5285    
## aSUP_Ca                    0.199352   0.218462   0.913   0.3645    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value   
## s(log2_age)  1.00      1 1.192 0.27847   
## s(FAM)      43.17     85 0.943 0.00407 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.539   Deviance explained = 73.1%
## GCV = 0.20892  Scale est. = 0.12102   n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + s(log2_age) + 
##     s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df      F p-value
## aBreastFeed_full_stopped   1 12.002  0.0009
## aBreastFeed_full_duration  1  1.135  0.2902
## aBreastFeed_total_stopped  1  0.955  0.3317
## SEX                        1  0.004  0.9526
## GRP                        2  0.203  0.8164
## aSUP_Ca                    1  0.833  0.3645
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age)  1.00   1.00 1.192 0.27847
## s(FAM)      43.17  85.00 0.943 0.00407
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    aSUP_Ca +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)   
## (Intercept)               -1.40482    0.43536  -3.227  0.00186 **
## aBreastFeed_total_stopped -0.62344    0.41161  -1.515  0.13411   
## SEXM                      -0.01042    0.24324  -0.043  0.96595   
## GRPVG                      0.28534    0.39479   0.723  0.47210   
## GRPVN                      0.02790    0.35445   0.079  0.93746   
## aSUP_Ca                    0.40476    0.65072   0.622  0.53584   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.118  2.546 1.837 0.172038    
## s(FAM)      41.584 85.000 1.087 0.000279 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.506   Deviance explained = 70.2%
## GCV = 1.8452  Scale est. = 1.1056    n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_total_stopped + SEX + GRP + aSUP_Ca + 
##     s(log2_age) + s(FAM, bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_total_stopped  1 2.294   0.134
## SEX                        1 0.002   0.966
## GRP                        2 0.347   0.708
## aSUP_Ca                    1 0.387   0.536
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.118  2.546 1.837 0.172038
## s(FAM)      41.584 85.000 1.087 0.000279
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.
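
The choice between the raw and the log2 scale above rests on the diagnostic plots produced by `pltmd()`, a helper defined elsewhere in the report. As a rough numeric companion, the sketch below fits the same kind of `gam()` with a family random effect to simulated data and compares residual skewness on both scales. The data, effect sizes and the `skewness()` helper are assumptions made for illustration only.

```r
library(mgcv)  # same package that provides gam() above

set.seed(1)

# Simulated, purely illustrative data: 40 families with 3 children each
n_fam <- 40
d <- data.frame(
  FAM      = factor(rep(seq_len(n_fam), each = 3)),
  log2_age = runif(n_fam * 3, 0, 4)
)
fam_effect <- rnorm(n_fam, sd = 0.5)

# The outcome is generated on the log2 scale, so errors are multiplicative on the raw scale
d$outcome <- 2^(1 + 0.2 * d$log2_age +
                  fam_effect[as.integer(d$FAM)] +
                  rnorm(nrow(d), sd = 1))

fit_raw <- gam(outcome ~ s(log2_age) + s(FAM, bs = "re"), data = d)
fit_log <- gam(log2(outcome) ~ s(log2_age) + s(FAM, bs = "re"), data = d)

# Sample skewness of residuals (hypothetical helper, not pltmd())
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_raw <- skewness(residuals(fit_raw))
skew_log <- skewness(residuals(fit_log))
c(raw = skew_raw, log2 = skew_log)
```

Residual skewness closer to zero on the log2 scale points the same way as a more symmetric residual histogram in the diagnostic plots. It does not replace the visual check.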

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE, include = c('aSUP_Ca'), remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                               Estimate Std. Error      t value
## (Intercept)               -0.076536500 0.66336003 -0.115377014
## GRPVG                      0.190229336 0.43305270  0.439275258
## GRPVN                      0.024865529 0.38476039  0.064626010
## SEXM                      -0.001049997 0.26840706 -0.003911958
## aBreastFeed_full_stopped  -1.442484124 0.77340047 -1.865119285
## aBreastFeed_full_duration  0.025411915 0.06572311  0.386651151
## aBreastFeed_total_stopped -0.646864841 0.44424712 -1.456092382
## log2_age                  -0.063253980 0.19856893 -0.318549237
## aSUP_Ca                    0.557720554 0.72542534  0.768818683


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(uCa:uCreatinine ratio) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(uCa:uCreatinine ratio) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U       P
VN vs OM    0.0248655   -0.7292510   0.778982   0.9484718
VG vs OM    0.1902293   -0.6585384   1.038997   0.6604621
VN vs VG   -0.1653638   -0.9134796   0.582752   0.6648468
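
The main model above is wrapped in `run(..., path = ..., reuse = TRUE)`, which suggests that fitted models are cached on disk and reused on later renders. The implementation of `run()` is not shown in this section, so the following is only a guess at the pattern, using a hypothetical `run_cached()` helper built on base R `saveRDS()` and `readRDS()`.

```r
# Hypothetical helper, NOT the report's run(): evaluate `expr` once and reuse the saved result
run_cached <- function(expr, path, reuse = TRUE) {
  file <- paste0(path, ".rds")
  if (reuse && file.exists(file)) {
    return(readRDS(file))      # `expr` is never evaluated (lazy evaluation)
  }
  value <- expr                # forces the possibly slow model fit
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  saveRDS(value, file)
  value
}

# Usage with a cheap stand-in model and a temporary path
n_fits <- 0
slow_fit <- function() {
  n_fits <<- n_fits + 1
  lm(dist ~ speed, data = cars)
}

path <- tempfile("mod_demo_main")
fit1 <- run_cached(slow_fit(), path)   # fits the model and saves it
fit2 <- run_cached(slow_fit(), path)   # loads from disk without refitting
n_fits
```

With such a cache, changing the model specification requires deleting the stored file or setting `reuse = FALSE`. Otherwise the stale fit is returned.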

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme(include = c('aSUP_Ca'))
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(include = c('aSUP_Ca'), remove_random = TRUE)

### model without sex
mod_nsex <- rlme(include = c('aSUP_Ca'), exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(include = c('aSUP_Ca'),
                exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(include = c('aSUP_Ca'), 
                  exclude = c('GRP'))

### model without other cov.
mod_other_cov <- rlme()

### model without log2_age
mod_log2_age <- rlme(include = c('aSUP_Ca'),
                     exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

i <- i+1

2.2.39 aI_per_Krea

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aI_per_Krea"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1641.80     378.31   4.340 3.34e-05 ***
## aBreastFeed_full_stopped  -1073.17     379.75  -2.826  0.00566 ** 
## aBreastFeed_full_duration    -1.69      34.15  -0.049  0.96062    
## aBreastFeed_total_stopped    20.00     215.44   0.093  0.92623    
## SEXM                         51.35     133.75   0.384  0.70181    
## GRPVG                      -138.59     184.42  -0.751  0.45407    
## GRPVN                      -191.27     167.39  -1.143  0.25585    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                  edf Ref.df     F p-value
## s(log2_age) 1.00e+00      1 0.337   0.563
## s(FAM)      3.94e-09     77 0.000   0.534
## 
## R-sq.(adj) =  0.134   Deviance explained = 18.9%
## GCV = 5.2302e+05  Scale est. = 4.8533e+05  n = 111
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 7.986 0.00566
## aBreastFeed_full_duration  1 0.002 0.96062
## aBreastFeed_total_stopped  1 0.009 0.92623
## SEX                        1 0.147 0.70181
## GRP                        2 0.665 0.51651
## 
## Approximate significance of smooth terms:
##                  edf   Ref.df     F p-value
## s(log2_age) 1.00e+00 1.00e+00 0.337   0.563
## s(FAM)      3.94e-09 7.70e+01 0.000   0.534
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               10.06092    0.68073  14.780   <2e-16 ***
## aBreastFeed_full_stopped  -1.60706    0.67791  -2.371   0.0198 *  
## aBreastFeed_full_duration  0.03002    0.05653   0.531   0.5966    
## aBreastFeed_total_stopped -0.06920    0.36187  -0.191   0.8488    
## SEXM                       0.12638    0.22277   0.567   0.5718    
## GRPVG                     -0.33993    0.31632  -1.075   0.2853    
## GRPVN                     -0.46948    0.28494  -1.648   0.1027    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df     F p-value
## s(log2_age) 1.434  1.752 1.414   0.155
## s(FAM)      7.689 75.000 0.117   0.267
## 
## R-sq.(adj) =  0.253   Deviance explained = 35.6%
## GCV = 1.4416  Scale est. = 1.2322    n = 111
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 5.620  0.0198
## aBreastFeed_full_duration  1 0.282  0.5966
## aBreastFeed_total_stopped  1 0.037  0.8488
## SEX                        1 0.322  0.5718
## GRP                        2 1.378  0.2571
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F p-value
## s(log2_age)  1.434  1.752 1.414   0.155
## s(FAM)       7.689 75.000 0.117   0.267
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation improved the fit. We will continue to work with log-transformed values.

Fit main model

dat_mod$outcome <- log2(dat_mod$outcome) 
column_name <- paste0('log2_', column_name)

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error     t value
## (Intercept)               10.30038788 0.52224027 19.72346533
## GRPVG                     -0.30623789 0.31005589 -0.98768609
## GRPVN                     -0.37843672 0.28143687 -1.34465937
## SEXM                       0.11919881 0.22487136  0.53007556
## aBreastFeed_full_stopped  -1.51688347 0.63846764 -2.37581888
## aBreastFeed_full_duration  0.03624065 0.05741881  0.63116326
## aBreastFeed_total_stopped -0.02027727 0.36221975 -0.05598059
## log2_age                  -0.34291254 0.16105193 -2.12920475


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of log2(uIodine:uCreatinine ratio) level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of log2(uIodine:uCreatinine ratio) level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L         CI-U        P
VN vs OM   -0.3784367   -0.9300429   0.1731694   0.1787353
VG vs OM   -0.3062379   -0.9139363   0.3014605   0.3233064
VN vs VG   -0.0721988   -0.6283935   0.4839958   0.7991710

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)
### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

i <- i+1

2.2.40 aP_per_Krea

Data selection and diagnostic plots

column_name <- names(dat_child_all)[i+ni]

dat_mod <- dat_child_all %>%
  mutate(outcome = !!sym(column_name)) %>% 
  filter(!is.na(outcome))

column_name
## [1] "aP_per_Krea"

gamm <- gam(
  outcome ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.925357   1.132349   1.700 0.093669 .  
## aBreastFeed_full_stopped   1.881968   1.111186   1.694 0.094935 .  
## aBreastFeed_full_duration  0.001132   0.073961   0.015 0.987830    
## aBreastFeed_total_stopped  1.133605   0.540213   2.098 0.039606 *  
## SEXM                       0.064015   0.310388   0.206 0.837223    
## GRPVG                     -0.632002   0.530655  -1.191 0.237826    
## GRPVN                     -1.901702   0.467373  -4.069 0.000126 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.924  3.535 7.237 0.000129 ***
## s(FAM)      46.491 85.000 1.238 0.000293 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.64   Deviance explained = 80.2%
## GCV = 3.0103  Scale est. = 1.6407    n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## outcome ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df     F p-value
## aBreastFeed_full_stopped   1 2.868 0.09493
## aBreastFeed_full_duration  1 0.000 0.98783
## aBreastFeed_total_stopped  1 4.403 0.03961
## SEX                        1 0.043 0.83722
## GRP                        2 9.258 0.00028
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.924  3.535 7.237 0.000129
## s(FAM)      46.491 85.000 1.238 0.000293
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again with log2-transformed outcome

gamm <- gam(
  log2(outcome) ~ 
    aBreastFeed_full_stopped +
    aBreastFeed_full_duration +
    aBreastFeed_total_stopped + 
    SEX + 
    GRP +
    s(log2_age) +
    s(FAM, bs='re'), 
  data = dat_mod)

summary(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                0.640566   0.536544   1.194   0.2365    
## aBreastFeed_full_stopped   1.232943   0.528545   2.333   0.0225 *  
## aBreastFeed_full_duration -0.007906   0.035387  -0.223   0.8238    
## aBreastFeed_total_stopped  0.341739   0.255868   1.336   0.1859    
## SEXM                      -0.010679   0.147962  -0.072   0.9427    
## GRPVG                     -0.246866   0.246555  -1.001   0.3201    
## GRPVN                     -0.939780   0.217520  -4.320 4.95e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value    
## s(log2_age)  2.897  3.518 7.033 0.000160 ***
## s(FAM)      42.828 85.000 1.041 0.000937 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.631   Deviance explained = 78.6%
## GCV = 0.68555  Scale est. = 0.39406   n = 124
anova(gamm)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log2(outcome) ~ aBreastFeed_full_stopped + aBreastFeed_full_duration + 
##     aBreastFeed_total_stopped + SEX + GRP + s(log2_age) + s(FAM, 
##     bs = "re")
## 
## Parametric Terms:
##                           df      F  p-value
## aBreastFeed_full_stopped   1  5.442   0.0225
## aBreastFeed_full_duration  1  0.050   0.8238
## aBreastFeed_total_stopped  1  1.784   0.1859
## SEX                        1  0.005   0.9427
## GRP                        2 10.949 7.14e-05
## 
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(log2_age)  2.897  3.518 7.033 0.000160
## s(FAM)      42.828 85.000 1.041 0.000937
plot(gamm, select = 1)

### Model check
pltmd(gamm)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Log-transformation did not improve the fit, so we keep the outcome on its original scale.

Fit main model

## main model 
mod_main <- run(
   rlme(main = TRUE, remove_random = FALSE),
  path = paste0('gitignore/run/mod_child_all_', column_name, '_main_mixef'),
  reuse = TRUE)


summary(mod_main)[['coefficients']]
##                              Estimate Std. Error     t value
## (Intercept)                1.28639520 0.78634131  1.63592472
## GRPVG                     -0.63276234 0.45099825 -1.40302614
## GRPVN                     -2.11710046 0.40853656 -5.18215665
## SEXM                       0.11351050 0.32533097  0.34890777
## aBreastFeed_full_stopped   4.14024556 0.95401253  4.33982306
## aBreastFeed_full_duration -0.00623018 0.08080801 -0.07709854
## aBreastFeed_total_stopped  1.54665232 0.53104555  2.91246641
## log2_age                  -1.18244142 0.22876662 -5.16876722


res <- emm(mod_main)

suppl_table <- kbl(res, caption = 
      'Results of linear mixed-effects model of `uP:uCreatinine ratio` level. CI_L and CI_U are bounds of 95% confidence interval') %>% 
  kable_styling('striped',full_width = T) %>% 
  column_spec(1, width_min = '1in')

suppl_table
Results of linear mixed-effects model of `uP:uCreatinine ratio` level. CI_L and CI_U are bounds of 95% confidence interval
Contrast   Estimate     CI-L        CI-U         P
VN vs OM   -2.1171005   -2.917817   -1.3163835   0.0000002
VG vs OM   -0.6327623   -1.516703    0.2511780   0.1606090
VN vs VG   -1.4843381   -2.287226   -0.6814507   0.0002907
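
The VN vs OM row above appears consistent with a normal-approximation (Wald) interval built from the `GRPVN` coefficient and its standard error in `summary(mod_main)`. The internals of `emm()` are not shown in this section, so the sketch below is an inference from the printed numbers rather than a description of that helper.

```r
# Coefficient and standard error for GRPVN, copied from summary(mod_main) above
estimate  <- -2.11710046
std_error <- 0.40853656

# Wald z statistic, 95% interval and two-sided P value under a normal approximation
z       <- estimate / std_error
ci      <- estimate + c(-1, 1) * qnorm(0.975) * std_error
p_value <- 2 * pnorm(-abs(z))

round(c(estimate = estimate, ci_l = ci[1], ci_u = ci[2]), 4)
signif(p_value, 2)
```

The bounds agree with the tabulated CI-L and CI-U to the printed precision. The VG vs OM row can be checked the same way from the `GRPVG` coefficient, whereas the VN vs VG contrast also needs the covariance between the two coefficients, which is not printed here.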

Leave-one-factor mixed models

### main model but non-robust
mod_main <- rlme()
res2 <- emm(mod_main)

### model  without random effect
mod_nonran <- rlme(remove_random = TRUE)

### model without sex
mod_nsex <- rlme(exclude = c('SEX'))

### model without breastfeeding-related predictors
mod_nbf <- rlme(exclude = c('aBreastFeed_full_stopped',
                            'aBreastFeed_total_stopped',
                            'aBreastFeed_full_duration'))

### model without diet groups
mod_ndiet <- rlme(exclude = c('GRP'))

### model without other cov.
mod_other_cov <- NA

### model without log2_age
mod_log2_age <- rlme(exclude = c('log2_age'))

Putting key results together

## AIC

AIC_child_all <- add_AIC(AIC_child_all, mixef = TRUE)

diet_child_all <- add_eff(diet_child_all, res, mixef = TRUE)
diet_child_all_non_robust <- add_eff(diet_child_all_non_robust, res2, mixef = TRUE)

2.2.41 Save tables

## Between-diets differences

if (!file.exists('gitignore/data/diet_child_all_mixeff.xlsx')) {
  diet_child_all <- diet_child_all[-1,]
  
  diet_child_all$VN_OM_P_adj <- p.adjust(diet_child_all$VN_OM_P, method = 'fdr')
  diet_child_all$VG_OM_P_adj <- p.adjust(diet_child_all$VG_OM_P, method = 'fdr')
  diet_child_all$VN_VG_P_adj <- p.adjust(diet_child_all$VN_VG_P, method = 'fdr')
  
# for (i in seq(1, nrow(diet_child_all))) {
#   padj <- p.adjust(c(
#     diet_child_all$VN_OM_P[i], 
#     diet_child_all$VG_OM_P[i],
#     diet_child_all$VN_VG_P[i]), 
#     method = 'hochberg')
#   
#   diet_child_all$VN_OM_P_adj[i] <- padj[1]
#   diet_child_all$VG_OM_P_adj[i] <- padj[2]
#   diet_child_all$VN_VG_P_adj[i] <- padj[3]
# }
  
  diet_child_all <- diet_child_all %>%
    
    select(outcome, estimand, 
         VN_OM_diff, VN_OM_P, VN_OM_P_adj,
         VG_OM_diff, VG_OM_P, VG_OM_P_adj,
         VN_VG_diff, VN_VG_P, VN_VG_P_adj) %>%
    
    mutate(across(.cols = 3:11, .fns = ~as.numeric(as.character(.)))) 

  write.xlsx(diet_child_all, 'gitignore/data/diet_child_all_mixeff.xlsx')
}


if (!file.exists('gitignore/data/diet_child_all_mixeff_non_robust.xlsx')) {
  diet_child_all_non_robust <- diet_child_all_non_robust[-1,]
  
  diet_child_all_non_robust$VN_OM_P_adj <- p.adjust(
    diet_child_all_non_robust$VN_OM_P, method = 'fdr')
  
  diet_child_all_non_robust$VG_OM_P_adj <- p.adjust(
    diet_child_all_non_robust$VG_OM_P, method = 'fdr')
  
  diet_child_all_non_robust$VN_VG_P_adj <- p.adjust(
    diet_child_all_non_robust$VN_VG_P, method = 'fdr')
  
# for (i in seq(1, nrow(diet_child_all_non_robust))) {
#   padj <- p.adjust(c(
#     diet_child_all_non_robust$VN_OM_P[i], 
#     diet_child_all_non_robust$VG_OM_P[i],
#     diet_child_all_non_robust$VN_VG_P[i]), 
#     method = 'hochberg')
#   
#   diet_child_all_non_robust$VN_OM_P_adj[i] <- padj[1]
#   diet_child_all_non_robust$VG_OM_P_adj[i] <- padj[2]
#   diet_child_all_non_robust$VN_VG_P_adj[i] <- padj[3]
# }
  
  diet_child_all_non_robust <- diet_child_all_non_robust %>%
    
    select(outcome, estimand, 
         VN_OM_diff, VN_OM_P, VN_OM_P_adj,
         VG_OM_diff, VG_OM_P, VG_OM_P_adj,
         VN_VG_diff, VN_VG_P, VN_VG_P_adj) %>%
    
    mutate(across(.cols = 3:11, .fns = ~as.numeric(as.character(.)))) 

  write.xlsx(diet_child_all_non_robust, 'gitignore/data/diet_child_all_mixeff_non_robust.xlsx')
}

## Importance of predictors

if (!file.exists('gitignore/data/AIC_child_all_mixeff.xlsx')) {
  AIC_child_all <- AIC_child_all[-1,]
  write.xlsx(AIC_child_all, 'gitignore/data/AIC_child_all_mixeff.xlsx')
}
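
In the chunk above, `p.adjust(..., method = 'fdr')` is applied to each contrast column separately, so the adjustment runs across outcomes within one contrast. The commented-out Hochberg alternative would instead adjust across the three contrasts within each outcome. The sketch below shows what the FDR (Benjamini-Hochberg) adjustment does, using made-up P values and a manual recomputation for comparison.

```r
# Hypothetical raw P values for one contrast across five outcomes
p_raw <- c(0.0002, 0.012, 0.03, 0.16, 0.80)

# Benjamini-Hochberg adjustment ('fdr' is an alias for 'BH' in p.adjust)
p_fdr <- p.adjust(p_raw, method = "fdr")

# Manual version: p * m / rank, then made monotone starting from the largest P value
m   <- length(p_raw)
ord <- order(p_raw, decreasing = TRUE)
p_manual <- numeric(m)
p_manual[ord] <- pmin(1, cummin(p_raw[ord] * m / (m:1)))

data.frame(p_raw, p_fdr, p_manual)
```

Because the adjusted value depends on the number of P values in the vector, the result changes with how many outcomes enter each column.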

3 Reproducibility

sessionInfo()
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=cs_CZ.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Prague
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] mice_3.17.0        patchwork_1.2.0    ggrepel_0.9.5      robustlmm_3.3-1   
##  [5] gridExtra_2.3      pheatmap_1.0.12    performance_0.12.2 quantreg_5.98     
##  [9] SparseM_1.81       bayesplot_1.8.1    ggdist_3.3.2       kableExtra_1.4.0  
## [13] lubridate_1.8.0    corrplot_0.92      arm_1.12-2         MASS_7.3-64       
## [17] projpred_2.0.2     glmnet_4.1-8       boot_1.3-31        cowplot_1.1.1     
## [21] pROC_1.18.0        mgcv_1.9-1         nlme_3.1-167       openxlsx_4.2.5    
## [25] flextable_0.9.6    sjPlot_2.8.16      car_3.1-2          carData_3.0-5     
## [29] gtsummary_2.0.2    emmeans_1.10.4     ggpubr_0.4.0       lme4_1.1-35.5     
## [33] Matrix_1.7-0       forcats_1.0.0      stringr_1.5.1      dplyr_1.1.4       
## [37] purrr_1.0.2        readr_2.1.2        tidyr_1.3.1        tibble_3.2.1      
## [41] ggplot2_3.5.1      tidyverse_1.3.1   
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.4.3           later_1.3.0             gamm4_0.2-6            
##   [4] cellranger_1.1.0        datawizard_0.12.2       rpart_4.1.24           
##   [7] reprex_2.0.1            lifecycle_1.0.4         rstatix_0.7.0          
##  [10] lattice_0.22-5          insight_0.20.2          backports_1.5.0        
##  [13] magrittr_2.0.3          rmarkdown_2.27          yaml_2.3.5             
##  [16] httpuv_1.6.5            zip_2.2.0               askpass_1.1            
##  [19] DBI_1.1.2               minqa_1.2.4             RColorBrewer_1.1-2     
##  [22] multcomp_1.4-18         abind_1.4-5             rvest_1.0.2            
##  [25] nnet_7.3-20             TH.data_1.1-0           sandwich_3.0-1         
##  [28] gdtools_0.3.7           pbkrtest_0.5.1          crul_1.5.0             
##  [31] MatrixModels_0.5-3      svglite_2.1.3           codetools_0.2-19       
##  [34] xml2_1.3.3              tidyselect_1.2.1        shape_1.4.6            
##  [37] farver_2.1.0            ggeffects_1.7.0         httpcode_0.3.0         
##  [40] matrixStats_1.3.0       jsonlite_1.8.8          mitml_0.4-3            
##  [43] ellipsis_0.3.2          ggridges_0.5.3          survival_3.7-0         
##  [46] iterators_1.0.14        systemfonts_1.0.4       foreach_1.5.2          
##  [49] tools_4.4.3             ragg_1.2.1              Rcpp_1.0.13            
##  [52] glue_1.7.0              pan_1.6                 xfun_0.46              
##  [55] distributional_0.4.0    loo_2.4.1               withr_3.0.1            
##  [58] fastmap_1.2.0           fansi_1.0.6             openssl_1.4.6          
##  [61] digest_0.6.37           R6_2.5.1                mime_0.12              
##  [64] estimability_1.5.1      textshaping_0.3.6       colorspace_2.0-2       
##  [67] utf8_1.2.4              generics_0.1.3          fontLiberation_0.1.0   
##  [70] data.table_1.15.4       robustbase_0.93-9       httr_1.4.2             
##  [73] htmlwidgets_1.6.4       pkgconfig_2.0.3         gtable_0.3.0           
##  [76] htmltools_0.5.8.1       fontBitstreamVera_0.1.1 scales_1.3.0           
##  [79] knitr_1.48              rstudioapi_0.16.0       tzdb_0.2.0             
##  [82] uuid_1.0-3              coda_0.19-4             curl_4.3.2             
##  [85] nloptr_2.0.0            zoo_1.8-9               sjlabelled_1.2.0       
##  [88] parallel_4.4.3          pillar_1.9.0            vctrs_0.6.5            
##  [91] promises_1.2.0.1        jomo_2.7-3              dbplyr_2.1.1           
##  [94] xtable_1.8-4            evaluate_1.0.0          fastGHQuad_1.0.1       
##  [97] mvtnorm_1.1-3           cli_3.6.3               compiler_4.4.3         
## [100] rlang_1.1.4             crayon_1.5.0            rstantools_2.1.1       
## [103] ggsignif_0.6.3          labeling_0.4.2          modelr_0.1.8           
## [106] plyr_1.8.6              sjmisc_2.8.10           fs_1.6.4               
## [109] stringi_1.7.6           viridisLite_0.4.0       assertthat_0.2.1       
## [112] munsell_0.5.0           fontquiver_0.2.1        sjstats_0.19.0         
## [115] hms_1.1.1               gfonts_0.2.0            shiny_1.9.1            
## [118] highr_0.11              haven_2.4.3             broom_1.0.6            
## [121] DEoptimR_1.0-10         readxl_1.3.1            officer_0.6.6